9

advertisement
July, 1968
Report ESL-R-355
Copy
9
RAPID PRE -INDEXING BY MACHINE
by
William R. Kampe II
The work described in this document was performed as part of Project Intrex under Research Grant NSFC-472 (Part) awarded to the
Massachusetts Institute of Technology by the National Science Foundation and the Advanced Research Projects Agency of the Department
of Defense. This grant is designated as M.I.T. DSR Project No.
70054.
Electronic
Department of
Massachusetts
Cambridge,
Systems Laboratory
Electrical Engineering
Institute of Technology
Massachusetts 02139
I
i,
FOREWORD
Except for minor editorial changes, this report is the thesis submitted by Mr. William R. Kampe II to the Electrical Engineering
Department, Massachusetts Institute of Technology, in partial fulfillment of the requirements for the degree of Master of Science. A
few alterations in the original wording have been made throughout
the text in an effort to enhance clarity, and several pages have been
reformatted; otherwise the manuscript remains as submitted.
J. F.
Reintjes
Professor of Electrical Engineering
{V
ABSTRACT
This report describes the development of a new method of subject indexing by machine for documents in the Project INTREX catalog. The
purpose of the system is to allow new documents to be placed online
quickly in the computer-stored INTREX catalog. The system that is
developed makes use of human-generated subject terms of existing
INTREX documents as a basis for generating index terms for new
documents. The pre--indexing system operates on only the title and
abstract of a document in generating a pre--index for the document.
The analysis of documents already containing human-generated subject indexes consisted of comparing the titles and abstracts of the
documents to their subject indexes. A large dictionary with data
about word usage was obtained from these comparisons. The dictionary served as a guide for the later pre -indexing of new documents.
Three variations of the automatic pre -indexing method have been de veloped, tested, and evaluated. Two methods show promise for
operational use in the INTREX system.
v
ACKNOWLEDGMENT
I am greatly indebted to Professor J. Francis Reintjes, thesis supervisor, for his guidance, suggestions, and critical evaluation of all
stages of this research. I thank Peter Kugel for his continued encouragement and for his endless patience in discussing new technical
approaches. Thanks are also expressed to Messrs. Robert Kusik,
Richard Marcus, and Alan Benenfeld for their explanations of the
Project INTREX cataloging proce s s.
In addition, I am grateful to Professor Lynwood Bryant, who spent
much time studying the manuscript and making pertinent suggestions
for improvement. I also thank the Publications and Drafting Personnel of the Electronic Systems Laboratory for their efforts in typing
and assembling the final document.
This research was supported by the Department of Electrical Engineering and by Project INTREX. Project INTREX is supported
through grants from the Carnegie Corporation and the Council on
Library Resources, Inc., and through Contract NSFC-472 from the
National Science Foundation and Advanced Research Projects Agency.
vi
CONTENTS
CHAPTER I
CHAPTER II
THE NEED FOR PRE-INDEXING
page
The INTREX Environment
1
Motivation for Pre-Indexing
4
Plan of the Research
4
PHASE I - DATA GATHERING
6
The Analysis of Documents Already
Having Subject Indexes
CHAPTER III
CHAPTER IV
CHAPTER V
1
8
Building a Dictionary
12
Possible Pre-Indexing Criteria
14
THE DEVELOPMENT OF PRE-INDEXING
SCHEMES
18
The Form of the Title and Abstract for
Pre-Indexing
18
Three Methods of Pre-Indexing
18
Method I - Usage Rate Only
20
Method II - Usage Rate and Frequency
Thresholds
20
Method III - Context Decisions
20
Testing the Pre-Indexing Methods
23
RESULTS OF PRE-INDEXING TRIALS
24
Subjective Evaluation of the Informational
Content of Pre-Indexes
24
The Role of Function Words
27
The Effect of Varying the Usage Rate
Threshold
28
The Effect of New Words on Pre-Indexing
29
The Completeness-Relevance Trade-off
30
The Effect of Dictionary Size
30
CONCLUSIONS AND SUGGESTIONS FOR
FURTHER RESEARCH
32
Streamlining the Pre-Indexing System
33
Applicability of Pre-Indexing to Other
Subject Areas
34
Suggestions for Further Research
34
The Possibility of Using Word Stemming
35
vii
-~~~~~~~~~~~~~~~~~~~
-
*-·----~-------
---
--
~I-~~
CONTENTS (Contd.)
APPENDIX A
APPENDIX B
APPENDIX C
BREAKDOWN OF WORDS IN DICTIONARY
BY USAGE RATE AND FREQUENCY
RESULTS OF PRE-INDEXING TRIALS
page 37
38
Method I
38
Method II
39
Method III
40
PROGRAM LISTINGS AND STRUCTURE
41
Phase I
41
Phase II
51
BIBLIOGRAPHY
61
viii
LIST OF FIGURES
1.
The Pre-Indexing System
2.
The Target Set of Words for Pre-Indexing (Shaded Area)
7
3.
The Pre-Index
7
4.
Typical Subject Index
9
5.
Typical Title and Abstract
9
6.
TA List and SI List
10
7.
Size of Dictionary as a Function of the Number of
Documents Analyzed
13
Set of Words Included in the Dictionary Data Base
(Shaded Area)
13
Frequency of Appearance for Words in the Dictionary
Based on 80 Document Records
14
Average Word Usage Rate as a Function Frequency
of Appearance
16
11.
Word Inclusion Region for Method I
21
12.
Word Inclusion for Method II
21
13.
Word Zones for Method III
21
14.
Results of Method I for 30 Documents
25
15.
Results of Method II for 30 Documents
25
16.
Results of Method III for 30 Documents
26
17.
Comparison of Average Results for all Methods on
30 Documents
26
8.
9.
10.
page
ix
4
I
CHAPTER I
THE NEED FOR PRE-INDEXING
This research involves an investigation of the feasibility of using
automatic machine-generated subject indexes as a possible means of
expediting the cataloging process for new documents to be added to the
Project INTREX library.
An online library in a fast-changing technical
field needs a method of putting information about a new document into
the library system quickly.
A possible method is to generate an auto-
matic pre-index from only the title and abstract of a new document.
The
pre-index is intended to serve as a temporary subject index until human
catalogers can replace the pre-index with a full subject index based on
an examination of the entire new document.
The pre-indexing schemes investigated are based on past experience gained from a computer analysis of the human-generated indexes
of a set of documents.
The experience thus obtained through comparison
of title and abstracts to corresponding subject indexes is used as a guide
for generating, by machine, a pre-index for a new document to be added
to the system.
The pre-indexing scheme being proposed differs from
previous work in automatic indexing in that the pre-indexing scheme
makes extensive use of prior human indexing in an automatic fashion.
The INTREX Environment
Project INTREX (Information Transfer Experiments) seeks to
exploit the multiaccess computer operated in an online mode as a basis
for a machine-stored library.
The INTREX collection will consist of
approximately 10, 000 documents in selected fields of materials science
and engineering.
Each document will be cataloged in depth and the cata-
log will be stored in a multiaccess computer.
Full text of each docu-
ment will be stored on microfilm and made accessible under computer
control.
A user of the library will be able to search the catalog by
means of a time-shared computer system.
Special remote consoles will
allow him to obtain displays of the full text of documents,
if he so desires.
or hard copies
At present,
INTREX is still in the developmental stage.
are no users of the system, as yet.
Nevertheless,
There
over 500 documents
have been fully cataloged and placed into the computer data base, accessible to the INTREX programmers on a time-sharing computer system.
It is this data base which is used for this research effort.
The INTREX
programmers have written numerous programs to facilitate access to
the data base.
The cataloging record for each document of the collection comprises many of the standard items found on the typical catalog cards of
conventional libraries.
The title and author, publication date, publisher,
and number of pages are included.
In addition,
an INTREX record for a
document includes information such as the language in which the document is written, the abstract, and a subject index.
Each particular type
of information is placed in its own field (title, author, publisher, catalog
number, and so forth) in a document record.
Most fields in a document
record can be obtained in a purely routine manner.
This information
comes with the docuiment and needs only to be transferred to the document record.
The abstract, when included in the record, also arrives
with the document and has been written by the author or an abstracting
agency.
Nearly all documents of the INTREX collection include an
abstract.
If not, the cataloger substitutes an excerpt.
In a document record, the one field which requires analysis of the
document in order to be generated is the subject index.
ject index is of prime interest in pre-indexing,
in some detail.
Since the sub-
it will be examined here
Of 49 possible descriptive fields being used by INTREX
for a document, the process of generating the subject index field alone
consumes over half of an indexer's time even for short journal articles.
For longer documents,
time.
the subject index takes a larger part of the total
The purpose of pre-indexing by machine is to relieve the indexers
of the pressing nature of the burden of writing the subject index for the
document.
In generating the subject index for a document, the indexers may
draw upon the material of the full document, including the title, abstract,
and text.
The subject index comprises several subject terms which
describe the content If the document.
Each subject term may range in
length from a single word to one or more noun phrases containing
-3several words.
To every subject term, the indexer assigns a relevancy
weight, which indicates how the term is used and how important it is in
the document.
This weighting system is a key feature of Project INTREX,
since it is hoped that the weights will allow more accurate retrieval of
pertinent documents.
A subject term may be labeled with any of five weights,
from 0 through 4.
The numbers 0 and 4 have special significance,
whereas the numbers
document.
numbered
1 through 3 indicate specific subject matter of the
A weight of 1 indicates that the subject term describes the
primary topic of the document;
weight (2) designates a secondary topic;
and weight (3),a much less central topic.
The special weight of 4 is
assigned to terms representing mathematical tools, instrumental tools,
or applications which are cited in the document but which are not central
to it.
The weight (0) designates a generic class to which the subject
matter of a document belongs.
Knowledge of the indexing behavior of the INTREX catalogers provides support for the concept of pre-indexing.
The catalogers for Proj-
ect INTREX are trained and competent in matters of library science.
Nevertheless,
they are not trained as subject experts in the area of
materials science, which is the area of the INTREX document collection.
This fact may influence the subject indexes which the indexers generate.
Although in writing a subject index, the indexer may use any word that
she desires in describing the document, the majority of words used will
be those employed by the author of the document.
It is natural that an
indexer who is not a subject expert should adopt the language of the
author in indexing.
Since the title and abstract of a document usually
provide an excellent indication of the subject matter of that document, a
large number of the subject terms of a subject index are borrowed
directly from the title and abstract of the document.
Thus, the indexers perform derivative indexing.
indexing,
In pure derivative
the words selected to portray the subject matter of a document
are only those words actually used by the author.
Salton at Cornell
University' has experimented with automatic derivative indexing from
Salton, G. (ed), "Information Storage and Retrieval, " Scientific Report
No. ISR-11, Department of Computer Science, Cornell University,
Ithaca, New York, June, 1966.
-4His results are very encouraging to the notion
title and abstract only.
Salton found that automatic derivative indexing from
of pre-indexing.
the title and abstract only is very nearly as good as automatic indexing
from full text.
Motivation for Pre-Indexing
Two factors combine to indicate the possibility of pre-indexing.
One factor indicates that pre-indexing is desirable;
that it is feasible.
the other factor,
First, the indexers spend a large part of their time
in writing the subject index.
It would be desirable if this process could
be bypassed, at least temporarily,
available to INTREX users quickly.
in order to make the document
Second, Salton's work and an ex-
amination of typical subject indexes reveals that pre-indexing is feasible.
Since the title and abstract will be placed with the document index into
an automatic method for selecting pertinent
the computer data base,
words or phrases for a document could generate a pre-index quickly for
the document before the subject indexing is completed.
Plan of the Research
Baskcally, the investigation of pre-indexing divides into two rather
distinct phases,
1 shows.
as Fig.
In Phase I, a number of document
Dictionary
and Data
\ Look-ups
Indexed
Documents
Documents
reProc ess
G~atherin
Indexing
a
/
Phase I
Phase 11
New
Document
Fig. 1 The Pre-lndexing System
Pre-Index
-5records is automatically analyzed by computer in order to build a dictionary and data base for later use in actual pre-indexing.
Further, the
analysis is intended to provide clues for designing the pre-indexing
methods to be used in Phase II.
In Phase II the results of the analysis
are applied to the automatic pre-indexing of new documents.
Also in
Phase II, the pre-indexes that are generated are evaluated to obtain a
measure of pre-indexing quality.
CHAPTER II
PHASE I - DATA GATHERING
To analyze profitably the titles,
abstracts,
and subject indexes of a
large number of documents in order to gain experience and data for later
pre-indexing,
it is necessary first to define the goals of pre-indexing.
The goal of automatic pre-indexing is to extract from the title and
abstract of a document the set of words that best describes the contents
of the document.
In this research,
it is assumed that those words in the
title and abstract of a document that also appear in the human-generated
subject index for the document are the best possible choices for the
words of the pre-index.
Thus, the goal of pre-indexing is to select from
the title and abstract all those words which the human indexers will
eventually include in the subject index.
Of course,
the pre-index may not always achieve the goal.
Figure 2
shows the relationship of the title-and-abstract words to the words of the
subject index of a document.
is the set of words TA
n
The target set of words for a pre-index PI
SI (read:
Abstract and Subject Index).
the set of words common to Title/
Words which are eventually included in the
human-generated subject index but are not found in the title and abstract
of the document are ignored in this study (the set of words SI-SI
nf TA).
Two types of errors are possible when words are selected for a
pre-index.
cluded.
The first is the omission of some words that should be in-
As a result of such omissions, the pre-index is incomplete.
The
second type of error is the inclusion in the pre-index words that should
not be included.
This second type of error decreases the relevance of
the words in the pre-index.
Definitions for two measures of pre-index
quality which indicate how well the two errors are avoided can now be
formulated as follows (see Fig. 3):
Completeness - The percentage of words in TA n SI,
the target pre-index, that are actually included in the
pre -index
Relevance - The percentage of words in PI, the actual
pre-index, that are also included in the target set,
TA n SI
-6-
-7-
Words of the
subject index (SI)
Words of the title
and abstract (TA)
Words that should be
selected for a pre-index
Fig. 2 The Target Set of Words for Pre-Indexing (Shaded Area)
Words that failed
to be selected
SI
Irrelevant words
Fig. 3 The Pre-Index
-8If both completeness and relevance of a pre-index are 100 percent, then
the pre-index exactly matches the target set of words,
TA
n SI.
In this. research the standards defined above are based on assumptions that must be remembered when trying to determine the true value
of an automatic pre-index.
First, the measurements use the human-
generated index as reference standards of completeness and relevance.
Second, the standards assume that enough information is contained in
the title and abstract of a document to extract a valid pre-index from
only those two sources.
Unfortunately,
the quality of the human indexing
cannot be evaluated objectively until the INTREX catalog is used operationally.
In the meantime it appears safe to assume that the human
index is a valid index and reference standard.
An examination of subject
indexes for several documents shows that most of the words of the subject index appear in the title and abstract of a document.
Furthermore,
the work of Salton at Cornell indicates that use of only the title and
abstract of a document is a good procedure for generating a derivative
index by machine.
There will be occasion te question the validity of the
two measures of pre-indexing later, but for the time being they appear
to provide a reasonable basis for an investigation of pre-indexing
methods.
Now what remains to be determined is a statistical basis for a
pre-indexing system.
The next section describes the analysis of many
documents which already have a subject index, as well as a title and
abstract, so that some decision rules may be devised for the selection
of words for a pre-index.
The Analysis of Documents Already Having Subject Indexes
The purpose of analyzing document records which already contain
subject indexes is twofold.
First, the analysis should yield information
for the formulation of possible decision rules for later auto-pre-indexing,
and second,
the analysis will result in a dictionary of all words that have
appeared in the titles and abstracts of all the documents analyzed.
Such
a dictionary is employed in the pre-indexing of new documents.
In this research, 80 documents already having subject indexes,
as
well as titles and abstracts, were analyzed by computer according to the
techniques formulated below.
The subject index of each document
positron Iannihilation
in potassium
(1);
weight
magnetic
fieldI orientation
(0);
subject terms
Fig. 4
Typical Subject Index
Title
Effect of
in
magnetic| field
Abstract
The I angularl correlation ·**
* *
of potassium I metal
Fig. 5 Typical Title and Abstract
potassium
-10comprises a list of subject terms.
A term may include one word or
several words combined into a grammatical phrase.
of a typical subject index is shown in Fig. 4,
the title and abstract are given in Fig. 5.
A representation
and representations of
In order to compare the
words in the title and abstract with the words in the subject index, the
analysis system in the Phase-I part of the pre-indexing process must
convert title and abstract into one list of words and the subject index
The converted lists are shown in Fig. 6.
into another.
TA
Effect
SI
(1)
magnetic
1
annihilation
Title Words
.
The list for
and weight
in
potassium
potassium
The
(0)
magnetic
angular
correlation
of
) Abstract Words
2
d Subject term
and weight
field
orientation
!
*
1:
Fig. 6 TA List and SI List
the title and abstract words is called TA;
words, SI.
the list for subject index
The purpose of these lists is to allow simple counts of the
usage of words in the title, abstract, and subject index.
In both the title and the abstract sections of TA, each different
word is entered only once, regardless of the number of tokens of that
word that actually occur in,
For instance,
for example,
the abstract of a document.
the word "of" may occur four times in the abstract of a
-11-
document.
of TA.
However, "of" will appear only once in the abstract section
If the word "of" also occurs in the title of the document, then
"of" will also be entered once in the title section of TA.
subject term is considered separately.
In SI, each
A subject term is reduced to a
list of words in a manner similar to that for the listing of the title,
The
word lists for all subject terms are placed sequentially in SI, and each
subject term fills its own block of SI.
Thus, the word "of" may occur
more than once in the entire SI list, but only once in each of several
subject terms.
Each subject term is also marked with its weightnumber.
Two definitions help in understanding the counting procedures used
in analyzing documents.
First, an appearance of a word is the occur-
rence of that word in either the title or abstract.
A word may have only
one title appearance and one abstract appearance per document,
Note
that title appearances are counted separately from abstract appearances,
Given that a word appears in a document title or abstract, then the
number of times that the word occurs in SI of the same document is defined as the usage of the word.
A word may have a usage count greater
than one since the word may be used in several subject terms.
The
usage count of a word having title appearances is kept separately from
the usage count of the same word also having an abstract appearance,
This distinction has been made because the significance of a word appearing in the abstract is different from the significance of the same word
appearing in the title of a document.
The usage for each word is also broken down into usage by the
weight of the subject terms in which it occurs.
Thus, a word may have
a total usage of. three, with a usage of two in weight-2 terms and a
usage of one in weight-4 terms.
This distinction of usage by weight is
made to determine if the weight usage of a word provides any clues for
selecting words for a pre-index.
not particularly significant.
Results indicate that weight usage is
Only the aggregate usage is meaningful.
At first glance it may appear somewhat arbitrary to permit a word
to have only one appearance in a title or abstract.
However, allowing
multiple appearances would create a difficult problem in counting the
usage for a particular word.
With multiple appearances,
it would be
impossible to determine just which appearance is responsible for the
usage of that word in the subject index.
Moreover, the important
statistic is not so much how many tokens of a word appear in the title
or abstract, but rather, given that the word appears at all, how likely
is that word to be used in the subject index.
Building a Dictionary
In order to provide a future basis for judging the significance of
words for pre-indexing,
the data that are collected for appearances and
usage of the words of the title and abstract must be stored in a dictionary.
As each new document record is analyzed,
the words from the
title and abstract of the document, along with the appearance and usage
data for those words,
are added to the dictionary.
Thus, the dictionary
will be a list of all words that have appeared in a title or abstract to
date, along with cumulative data about appearances and usage.
data portion of the dictionary,
In the
title data and abstract data are separated
in the same manner that the data were collected, although the dictionary
includes each word only once.
As stated previously,
80 document records were analyzed auto-
matically by computer in the manner just described.
Then the data
contained in the dictionary were inspected in order to learn about the
statistical basis for the usage of words in the subject index.
at this stage,
Note that
the data that has been collected merely represent the
behavior of the human indexers.
One item of interest is the number of words that the dictionary
contains, since the number of words in the dictionary may influence the
usefulness of the dictionary for pre-indexing.
Figure 7 shows the num-
ber of words in the dictionary as a function of the number of document
records analyzed.
The size of the dictionary is growing less rapidly in
the region of 80 documents than it was in the first few documents of the
collection.
New words were added to the dictionary at the rate of
approximately 35 per document during the analysis of the first 20 documents.
After 80 documents,
words per document.
the dictionary grew at the rate of 24 new
Such a growth curve indicates that the dictionary
includes many common words after only a small sample of documents
has been analyzed, but that new words will be encountered frequently in
new documents.
Later, in the design of a pre-indexing scheme for new
-13 documents, it will be necessary to provide for those words that are not
included in the dictionary, but appear in the title or abstract of a new
document.
x
x
Z 1600
I--
cx 1200
X
-/
u. 800
X
Z
400
0/
0
I
l
20
I
I
40
i
I
60
I
I
80
NUMBER OF DOCUMENT RECORDS
Fig. 7
Size of Dictionary as a Function of the Number of Documents Analyzed
The dictionary does not include all words that have been used in
subject index terms.
The dictionary is intended to provide information
on the selection of words from the title and abstract.
I
Figure 8 shows
aSI
Fig. 8. Set of Words Included in the Dictionary Data Base (Shaded Area)
again the relationship of the set of title and abstract words to the set of
subject index words.
the dictionary.
All the words in the shaded portion are filed in
They include all title and abstract words.
The words
-14in the unshaded portion are not included.
Although the unshaded set of
words may well be good words for inclusion in a subject index for future
documents, it is not the purpose of the dictionary merely to list such
words.
Since the words represented by the unshaded portion of Fig. 8
are not directly connected with the words of the title and abstract, the
words of the unshaded portion are omitted.
Possible Pre-Indexing Criteria
One possible clue for the selection of a word for a pre-index is
the frequency of appearance of the word in document records already
analyzed.
The frequency of appearance of a word refers to the percent-
age of the document records analyzed in which the word has made an
appearance.
Thus if a word has appeared in 40 of 80 abstracts, then
the word has a 50 percent frequency of appearance in abstracts.
The
same word may have a frequency of appearance of only ten percent in
titles.
Figure 9 shows the number of words from the dictionary in
Percentile
Percentile
of
Frequency
100
90 - 99
80- 89
70- 79
60- 69
50- 59
40 - 49
30 - 39
20-29
10- 19
0- 10
Fig. 9
Number of words in percentile
Title
Abstract
0
0
0
2
3
2
0
1
0
0
3
0
6
410
1
3
3
3
6
16
71
2009
Frequency of Appearance for Words in the Dictionary
Based on 80 Document Records
-15various percentiles of frequency for both title appearances and abstract
appearances.
High-frequency words tend to be function words -- mostly
prepositions and articles.
a document.
Such words convey little about the content of
Titles tend not to include high-frequency words.
Another possible criterion for pre-indexing is the usage rate of
words in the dictionary.
abstract),
Given that a word has appeared in a title (or
then the usage rate of a word is the ratio of the number of
times it is used in subject indexes to the number of appearances in titles
(or abstracts).
That is,
usage rate = (cumulative usage)/(no.
of appearances).
The usage rate, then, is just the average usage of the word per appearance.
A word has two usage rates.
appearances in titles;
One usage rate is based on its
the other, on its appearances in abstracts.
A
high usage rate for a word means that whenever the word appears in a
title (or abstract) it has a high probability of being used in the subject
index.
Thus,
there are two main types of information available for words
in the dictionary -- usage rate and frequency.
Figure 10 shows the usage
rate of the words in the dictionary as a function of frequency of appearance.
Since the usage of a word can be higher than the number of appear-
ances,
the usage rate has been truncated at a maximum value of 1.
dashed lines in Fig.
quency.
The
10 are the average usage rates as a function of fre-
Each word in the dictionary is actually represented by a point
somewhere on the two-dimensional graphs.
(The points that are shown
in the figure are merely dummy points for purposes of illustration. )
The data represented in Fig.
10 are an aggregate of the usage data
for the different weight usage counts.
Although the usage-count data
that is stored in the dictionary is separated by weight usage, the usage
rates shown in Fig.
all weights.
10 are determined from the total usage counts for
A graph showing usage rate by the individual weights yields
similar results.
With title words, however,
in weight-l subject terms.
most of the usage occurs
Title words have an extremely high proba-
bility of being used in subject terms -- over 90 percent.
The only pecu-
liarity of abstract word use is that high-frequency words are seldom
used in terms of weight (0) or weight (4),
indicating that terms of such
-16 -
Usage
Rate
1.00
x…
X x_
xx…x
_
-
x
AVERAGE USAGE RATE
0.75 -
x
'
total usage
o. of appearances
0.50
x
0.25
o.oo x
20
60
40
no. of appearances
no. of documents
(a)
Usage
Rate
1.00 xxx
x
100
80
Frequency
/
of Appearance
(percent)
Title Words
X
x
/
x
AVERAGE USAGE RATE
0.75
0.75 _
total usage
/
/
/
no. of appearances
/
50
x
x x
0.25 -x
0.00
x
x
xxxxx
x
40
20
(b)
Fig. 10
60
80
100
no. of appearances
Frequency
no. of documents
of Appearance
(percent)
Abstract Words
Average Word Usage Rate as a Function Frequency of Appearance
-17-
weights tend to be quite specific.
High-frequency "function" words,
such as prepositions, are used in phrases of other weights in order to
make such terms readable noun phrases, for example,
"positron
annihilation in potassium".
There is a useful characteristic of word usage that does not show
in Fig. 10, but is clearly demonstrated in Appendix A.
Very few low-
frequency words have usage rates at intermediate levels near the average usage rate.
The vast majority of low-frequency words are either
used very seldom or very often,
Thus, the dictionary contains a large number of words that the
indexers have used in the subject indexes of documents.
However, the
dictionary also contains a large number of words that the indexers have
consistently failed to use in subject indexes.
which never appear in subject terms,
Such words include verbs,
Hence, the dictionary contains
information that will be useful when dictionary words are encountered
in pre-indexing new documents.
When a word encountered in a new
document is found in the dictionary, the data on usage rates will indicate
whether the human indexers have considered the word as useful for indexing.
Of course, words that are not in the dictionary will be discov-
ered in new documents.
Rules for handling both known and new words
will be necessary in generating a pre-index.
The next chapter describes the actual methods to be used for
pre- indexing.
CHAPTER III
THE DEVELOPMENT OF PRE-INDEXING SCHEMES
All the automatic pre-indexes that have been evaluated have a
very simple form.
Basically, the pre-index is a list of words which is
intended to represent the content of a document.
All the pre-indexes
generated are the result of simple word-by-word selection procedures
to choose words from the titles and abstracts of documents.
Only in
one of three methods examined are words considered in context.
A
random sample of 30 documents with both titles and abstracts has been
used for the testing of pre-indexing.
The Form of the Title and Abstract for Pre-Indexing
All pre-indexing methods tested in this research operate on a list
of the title and abstract words in the document to be indexed.
paring a title and abstract of a document for pre-indexing,
In pre-
we have made
a slight change from the form of the TA list as described in the preceding chapter.
For pre-indexing,
a single list of title and abstract
words in a document is prepared, but now the list gives all words in
exact order of their occurrence,
including multiple occurrences.
Such
a listing preserves information on context, which may be useful for preindexing.
Three Methods of Pre-Indexing
Three different algorithms for pre-indexing from a title and
abstract word list have been developed.
common features.
The three methods have some
The similarities in the methods lie in the handling
of title words and in the handling of abstract words not found in the dictionary.
The primary differences in the methods are in the way that
the dictionary data are used.
For all methods, the words of the title of the document are included in the pre-index.
The data contained in the dictionary indicates that
including the title words is a very sound decision rule.
From the data
contained in the dictionary it is found that title words have at least a
-18-
-19 90 percent chance of being used in the subject index of a document;
hence title words should definitely be included in a pre -index.
All methods of pre -indexing face the problem of new words,
that is,
the occurrence in title or abstract of a word which is not con-
tained in the dictionary.
All new words, clearly, will be low-frequency
words, since the new words have not appeared in any documents previously.
The usage-rate averages, as graphed in Fig. 10 of Chap-
ter II, show that a low-frequency word in an abstract has only about a
30 percent chance of being used in the human-generated subject index
of the document.
Thus, when a new word appears in the abstract of a
document, there is no valid reason to include it in the pre -index.
Nevertheless,
if all new words are excluded, then many of the words
that belong in the pre-index will be omitted.
As noted previously,
even when a dictionary comprising the words from 80 documents is
used, over 20 new words are encountered in the title and abstract of
a new document.
Therefore, it has been decided that in all three pre-
indexing methods a new word will be included if it appears at least
twice in the title and abstract.
A manual inspection of several docu-
ment records indicates that if the word is used at least twice, then
there is a good chance that the word represents something important
to the content of the document.
New words which occur only once in
the title and abstract appear much more likely to be only filler and
not central to the content of the document.
Actual attempts at pre-
indexing support the notion that only a single occurrence of a new
word is insufficient for the word to be included in the pre -index.
The three methods differ mostly in the handling of words that
are contained in the dictionary.
The primary information available
about words in the dictionary is the frequency of appearance of the
word and the usage rate of the word,
For example, the dictionary
contains the information that the word "the" has appeared in 80 abstracts and has a cumulative usage count from abstracts of 27 in
weight-1 terms, 13 in weight-2 terms, 46 in weight-3 terms, 22 in
weight-4 terms, but only 1 in weight-0 terms.
use this information in different ways.
However, all usage data is
aggregated over all subject-term weights.
breakdown by individual weights.
The three methods
No use is made of the
Thus only the aggregated usage
-20 count of 109 is retrieved from the dictionary.
From the number of
appearances in abstracts and the total usage from abstracts,
it is
then known that the word "the" has a frequency of 100 percent in abstracts and a usage rate greater than 1,.0.
Method I - Usage Rate Only
The first method considers only the usage rate for a word which
is found in the dictionary.
for the pre-index.
pre-index.
Words with a high usage rate are selected
Words with a low usage rate are excluded from the
Figure 11 illustrates method I.
Words with usage rates
falling in the shaded area are included in the pre-index for the document.
The exact level of usage rate, R, that is required for a word
to be included in the pre -index is tested at three different threshold
levels to determine what effect the usage rate threshold has on the
completeness and relevance of a pre-index.
The three threshold
levels of the usage rate, R, that have been tested are 0.25,
0.50,
and 0.75.
Method II - Usage Rate and Frequency Thresholds
In the second method, not only the usage rate of a wordis considered, but also the frequency of appearance of the word is considered.
Since very high-frequency words tend to carry little in-
formation,
such words may possibly be excluded from the pre -index.
Therefore,
in method II the pre -index will include only low-frequency,
information-bearing words.
teria for method II.
Figure 12 illustrates the selection cri-
Words whose frequency and usage rate fall in
the shaded area are included in the pre-index.
words are excluded.
All other dictionary
In tests of this method, the usage -rate thresh-
old, R, has been held constant at 0.5 so that the effect of varying the
frequency threshold,
F,
may be observed.
has been tested at 25 percent,
50 percent,
The frequency threshold
and 75 percent.
Method III - Context Decisions
The third method of pre-indexing is
In the other two methods,
somewhat more involved.
a word is either selected or not selected
for the pre-index purely on the basis of the dictionary data for that
Inclusion Region
1.0
L
i-
0 , °75
R
0.50
0.25
50
FREQUENCY
Fig. 11
100
(PERCENT)
Word Inclusion Region for Method I
/Inclusion
0ii
Region
u 0.75
V 0.50
I
D
I
0.25 -
I
50
F
FREQUENCY
Fig. 12
100
(PERCENT)
Word Inclusion for Method II
Strong zone
0.2
0
.50
FREQUENCY
Fig. 13
150
F
(PERCENT)
Word Zones for Method III
-22 word only.
The third method of pre -indexing recognizes an "ambigu-
ous" class of words, for which the pre-indexing method considers
neighboring words before making a decision to include or exclude the
ambiguous word.
Figure 13 shows two thresholds,
rate and a threshold frequency F,
pre-indexing.
R1 andR2 on usage
being used in the third method of
Only words with a frequency below the frequencythresh-
old F are considered as candidates for the pre-index.
Words whose
usage rates fall in the zone between the two usage rate thresholds R1
and R2 are considered to be in the ambiguous zone.
Weak words,
which have usage rates lower than R1, are immediately excluded from
the pre-index.
Strong words, whose frequency and usage rates fall
in the cross-hatched region above R2 in Fig.
cluded in the pre-index.
13, are immediately in-
Words with usage rates between
R1 and R2
(the shaded region of Fig. 13) are marked for later decision.
If
such an ambiguous word neighbors a strong word on either side,
then the ambiguous word is included in the pre-index.
the ambiguous word is
However, if
surrounded only by weak words then the am-
biguous word is excluded from the pre-index.
This method was de-
vised purely as an experiment to aid in the handling of words whose
usage rate is at an intermediate level and for new words whose usage
rate is unknown.
For such words there is no sound basis for a se-
lection decision,
so content is considered in this method.
In essence,
this method recognizes simple phrases that are delimited by weak
words such as articles, prepositions, and verbs.
the phrase ". .the
For example, from
dynamic pulse hysteresis in...",
method III will
select the words "dynamic pulse hysteresis."
In method III as in the other two methods,
if a new word (a word
not contained in the dictionary) occurs at least twice in the title and
abstract of a document, then that word is included in the index.
such a new word occurs only once,
it is
If
considered to be an ambigu-
ous word and is included only if it has a strong neighbor.
This de-
cision rule was adopted because a manual inspection of many abstracts
indicated that although many meaningful words may appear only once
in the title and abstract of a document,
such words usually seem to
occur next to other meaningful words.
Through consideration of single-
occurrence new words as ambiguous rather than weak,
fewer
-23 -
meaningful words are omitted from a pre -index.
Later pre -indexing
trials showed that the primary difference between method II and
method III lies in the treatment of new words with single occurrences.
Recall that such words are excluded from a method II pre -index.
In tests of method III the frequency threshold was held constant
at 75 percent so that the effect of varying the ambiguous usage zone
could be observed.
For testing, the ambiguous zone was set at three
different ranges -- 0. 2 to 0.5, 0.3 to 0.5, and 0.3 to 0. 6.
Testing the
Pre -Indexing Methods
Thirty documents were automatically pre -indexed in order to
test the above methods.
To facilitate experimentation, the automatic
pre-indexing system was developed to pre-index each document by
all three methods with only one pass on a document.
To provide
further experimental efficiency, the pre -indexing system also preindexed under three different parameter sets for each method.
In
addition, the pre -indexing system also compared the different trial
pre-indexes of each document to the human-generated subject index
in order to yield immediate values of completeness and relevance for
the pre-indexes.
Completeness and relevance are the standards of
quality as defined in the previous chapter.
the results of the tests of pre-indexing.
The next chapter describes
CHAPTER IV
RESULTS OF PRE -INDEXING TRIALS
To test the pre -indexing methods that were described in the
preceding chapter,
used.
30 documents having titles and abstracts were
For each of the documents,
generated.
nine different pre -indexes were
The nine pre-indexes are the result of using the three
methods, each with three parameter sets.
The testing demonstrated
that the methods cause significant differences in pre -indexing results,
although changing parameters within a method has little effect on the
quality of a pre -index as measured by the standards of completeness
and relevance.
A simple way to view the results for a pre-indexing method is
to plot the completeness and relevance for each document pre -index
as a point on a graph.
The closer that a document pre -index is to
100 percent in both completeness and relevance, the better the preindex, at least in a primitive way.
indicate that there is
The results of pre-indexing trials
cause to question the measure of completeness
as it has been defined previously.
Figs. 14,
However,
for the time being,
15, and 16 show the plots of completeness and relevance
for pre-indexes by methods I, II,
and III respectively.
Although three
different parameter sets were tested for each method, a representative parameter set has been selected to illustrate the results of each
method.
Figure 17 summarizes the results for all three parameter
sets for each of the three methods.
In Fig.
17 only the average re-
sults of completeness and relevance are plotted.
Appendix B gives
the results of all pre-indexing trials in tabular form.
Subjective Evaluation of the Informational Content of Pre-Indexes
One of the first questions that comes to mind is,
"Does an auto-
matic pre-index appear to describe the subject content of the document?"
An inspection of pre-indexes for several documents indicates
that, indeed, the pre-indexes contain reasonably accurate subject
matter from the title and abstract of documents.
-24 -
For example,
from
-25 -
100
x
xxx
X
X
x
XK
x
x
x
x
x
x
Z~
X
X
x
' X
_x
x
Z
.
60
-
"'
Z
Parameter used:
Usage-Rate Threshold 0.50
40
O
U
AVERAGE: (
COMPLETENESS
RELEVANCE
20r
0
20
.!
I
I
I
I
o-
86 percent
61 percent
RELEVANCE
Fig. 14
I
,
60
40
I
,
80
100
(PERCENT)
Results of Method I for 30 Documents
100
Parameters used:
Usage-Rate Threshold 0.50
Frequency Threshold 75 percent
80
x
Z
x
;60 -
x
Z
-
xx
x
Z
K x
x
.
40
X
X
X
x
zX
x
t)r~~x
xKX
x
K
X
X
O
0
U
AVERAGE:
20 -
o
53 percent
RELEVANCE
62 percent
20
20
0
®
COMPLETENESS
40
40
RELEVANCE
Fig. 15
60
60
80
80
(PERCENT)
Results of Method II for 30 Documents
100
100
,
-26 j
100
-
Parameters used:
Usage-Rate Thresholds, R1 0.3; R2 0.6
Frequency Threshold
75 percent
80-x
X
x
60 -
x
Xx
)
x
x
X
_
X
K
40
x
x
K
20
i
ol
0
AVERAGE:
)
COMPLETENESS
59 percent
RELEVANCE
57 percent
I
1
20
RELEVANCE
Fig. 16
I ~
100
I
I
80
I
I
60
I
I
40
(PERCENT)
Results of Method III for 30 Documents
100
1
METHOD
(gB '\
80
I
C
U
METHOD
/A
A
m
60o
METHOD II
B \
\
I
O 40
U
20 _
0
0
A, B, and C indicate the different
parameter sets used. The values of the
parameters are included in Appendix B.
I
20
I
40
I
RELEVANCE
Fig. 17
I
60
I
80
1.
100
(PERCENT)
Comparison of Average Results for all Methods on 30 Documents
-27 the title and abstract of an article about "Dynamic Pulse Hysteresis
in Magnetic Devices",
the following is a partial list of terms in a pre -
index generated by method III.
instantaneous hysteresis
magnetic
risetime
an applied pulse field
dynamic pulse hysteresis
magnetic devices
quasi-static hysteresis
magnetic damping phenomena by
extended
with
The pre-indexing method overlooked some words that are pertinent,
the pre-index contains a sig-
such as "ferrite core", but in general,
nificant number of the important words from the title and abstract.
Only a few words of the pre -index words are not pertinent to the topic,
such as "an",
"by'',
"with",
and "extended".
In general the pre-
indexing methods are very good at eliminating verbs from pre -indexes.
The word "extended" is included because it has a high usage rate
from prior usage by the human indexers, who probably used it as an
adjective rather than a verb,
Overall, however,
the pre-indexing
system appears to do an effective job.
The Role of Function Words
The results from method I and method II,
Figs.
14 and 15, apparently show that method I is far superior to
method II.
vance;
as illustrated in
Both methods yield very nearly the same measured rele -
however, method I gives a completeness measure a full 33
percent higher than method II,
Yet the only difference in the two
methods is that method II eliminates high -frequency words from a
pre-index,
such as "in",
"to", "for",
"the", and "of".
include all title words in the pre -index.
Both methods
Both methods have the same
usage rate threshold for words found in the dictionary.
Moreover,
both methods treat new words in exactly the same way.
But method II
excludes numerous function words by excluding high-frequency words.
Such function words have little informational value.
In fact the two
most frequent words that still carry information are "magnetic" and
-28 'field", with frequencies just under 30 percent.
Words of such low
frequency have not been eliminated from pre-indexes.
However,
human indexers include high-frequency words in subject index terms
in order to make noun phrases.
(In INTREX each word of a noun
phrase is being placed in an inverted file of subject words and its
position within the phrase is being recorded.
The purpose of this pro-
cedure is to allow users of the INTREX system to state their requests
INTREX plans to test and
in terms of noun phrases, if they wish.
evaluate the merit of this kind of capability.)
When method II elimi-
nates such high-frequency function words from a pre -index, the completeness of that pre-index is reduced.
Nevertheless,
mational value of that pre-index is unchanged,
function words carry no information.
the infor-
since the excluded
The comparison of method I
with method II shows that fully 30 percent of all word occurrences
that are eventually found in the human-generated subject index are
merely informationless function words.
Thus, the comparison of the
two methods shows that the measures of completeness and relevance
unfortunately do not fully indicate the quality of a pre-index.
How-
ever, if the problem of high-frequency words is kept in mind, then the
measures of completeness and relevance still give at least a feeling
for the relative results of pre -indexing trials.
The Effect of Varying the Usage-Rate
Threshold
Within both methods I and III the usage-rate thresholds are
varied to determine if the exact setting of the usage rate threshold
affects the pre -index significantly.
In method III, where high-
frequency words are excluded from the pre -index, the settings of the
usage -rate thresholds have very little effect on the completeness and
relevance of the pre -indexes.
This is to be expected,
since only five
percent of all low-frequency words found in the dictionary have usage
rates that fall in the intermediate range from 0.20 to 0.60, the range
in which the thresholds were varied.
Thus, the particular setting of
the usage-rate threshold has negligible effect on the pre-indexes
generated by method III.
However,
in method I the setting of the
simple usage-rate threshold seems to have a much greater effect on
the pre -indexes generated by this method.
This is not surprising
-29when it is remembered that method I includes high-frequency words
in the pre-index;
their inclusion depends on the usage rates of such
high frequency words.
Since the usage rates of several of the higher
frequency words fall in the range of trial variation of the usage rate
threshold for method I, the completeness and relevance are more
sensitive to the usage-rate threshold,
The Effect of New Words on Pre-Indexing
Examination of Fig. 17 shows a noticeable difference in the pre indexing results between method II and method III.
However, from
the preceding discussion, it appears that there should be little difference because of the different ways of setting the usage-rate thresholds for the two methods, since both methods eliminate high-frequency
words.
Also, recall that both methods include new words that appear
at least twice.
The difference in the results obtained from the two
methods arises from the different way of handling new words that
occur only once in the abstract of a document.
In method II, such
words are considered as ambiguous words and may be included in the
pre-index under the proper circumstances.
By including such single-
occurrence new words, method III has a higher completeness than
method II, since the inclusion of any word in a pre -index can only
"help the measured completeness of that pre -index,
On the other hand,
some of the new words that method III selects for -inclusion in the preindex are not used in the human-generated subject index;
hence the
relevance of method II pre -indexes is lowered slightly.
In the early experimental stages of this research, all methods
of pre -indexing included all new words regardless of the number of
occurrences of that word in the title and abstract of the document.
However, changing the decision rule to require at least two occurrences of a word for definite inclusion in a pre -index appears to have
significantly increased the quality of the pre -indexes.
Under all
methods of pre -indexing, the relevance of pre -indexes increased by
approximately eight percent, while the completeness suffered byonly
two to three percent when the decision rule was changed to require at
least two occurrences.
-30The Completenes s-Relevance Trade -off
Further examination of Fig. 17 clearly shows that within a preindexing method,
pleteness.
relevance must be sacrificed in order to gain com-
Moreover,
comparing method II and method III, which
differ most significantly in policy toward new words,
that the same trade-off is made between methods.
pected.
ness.
one also finds
Such is to be ex-
Adding any word to a pre-index can never reduce completeIn fact if every word of the title and abstract of a document is
included in the pre-index, then the completeness for that pre -index
will be 100 percent.
But the relevance of that pre -index will be ap-
proximately 35 percent,
since only that percentage of words in the
title and abstract of a document are used in the average humangenerated subject index.
(The analysis of document records in
Phase I showed that approximately 35 percent of the words of the
title and abstract eventually appear in the human-generated subject
index.)
When words are eliminated from a pre -index in order to in-
crease relevance,
there is the statistical probability that some of the
eliminated words are actually necessary to maintain completeness.
Hence,
completeness must be sacrificed if rele--ance is to be raised.
Moreover, as is found by comparing method I with method II, a full
30 percent of completeness can be sacrificed without damaging the informational content of a pre -index merely by removing the high
frequency words from the pre -index.
The Effect of Dictionary Size
Even with a dictionary containing all the words from the titles
and abstracts of 80 documents,
many new words are encountered in
the pre -indexing of new documents.
An examination of the curve of
dictionary growth in Fig. 7 indicates that even if the dictionary were
obtained from a much larger document base, new words in titles and
abstracts would continue to have an important role in pre -indexes.
However,
the dictionary does contain a good collection of 40 very
common words, which appear in over 30 percent of all document abstracts.
Also, an examination of several pre -indexes reveals that
the dictionary contains most of the verbs which are encountered in
typical abstracts.
The three methods of pre-indexing are very good
at recognizing and excluding verbs from pre-indexes, for example,
forms of "to be", "to report", "to have", and "to discuss". In fact
a large portion of the selection decisions that the pre-indexing
methods must make about words found in the dictionary result in the
exclusion of those words, rather than the inclusion of the words.
In
addition, most decisions must be made about common words, since
such words appear most frequently. Therefore, the dictionary used
in this research is of sufficient size to provide a reasonable test of
the postulated pre-indexing methods. Only a very much larger dictionary would alter the proportion of new words encountered, and it
is doubtful that the results would be substantially different.
CHAPTER V
CONCLUSIONS AND SUGGESTIONS FOR FURTHER RESEARCH
The primary conclusion to be drawn from this research is that
automatic pre -indexing by machine is feasible with the application of
simple techniques.
Furthermore,
the implementation of the pre -
indexing methods developed here can be accomplished with computational efficiency if the pre -indexing system is designed as an operational system rather than as an experimental system.
The first step in the design of an operational pre -indexing
system is to select a suitable pre-indexing method.
It is the opinion
of this author that method I is not so useful for pre -indexing as are
the other two methods.
Recall that method I allows the inclusion of
high -frequency function words in the pre -index for a document,
In a pre -index which is
whereas the other two methods exclude them.
a list of descriptive words,
function words have no place.
The remaining pre-indexing methods,
II and III, present a
trade-off between completeness and relevance.
Method III, which in-
cludes a greater percentage of the new words that it encounters than
does method II,
tends to be slightly more complete,
but somewhat less
relevant.
However, it is necessary before one judges the relative merits
of methods I and II to look more closely at the figures for completeness for both of these methods.
The graph of Fig. 17 shows method II
with an average completeness of 53 percent;
method III, 59 percent.
From the prior examination of method I, recall that approximately 30
percent of the completeness measured for a pre -index is lost if highfrequency words are removed.
That particular 30 percent of com-
pleteness can be safely ignored in considering informational content,
since the high-frequency words removed in methods II and III are all
merely function words.
Thus in terms of residual informational con-
tent, the true average completeness of methods II and III probably
lies in the range of 80 percent to 90 percent, after making a correction
for the extraneous 30 percent.
Thus, both methods II and III appear to
-32 -
-33 contain a substantial portion of the informational content of the title
and abstract of a document.
Nevertheless, method III, which includes more of the new words,
would probably be the best pre-index for an operational system because of its greater completeness.
The price that must be paid for
the greater completeness is lessened relevance, but the larger preindex generated by method III should not create an undue burden.
Another distinction between method II and method III is that
method III recognizes and provides special processing for words
having intermediate usage rates.
Because a small dictionary was
used in the pre -indexing tests, this difference in methods caused only
very small differences in results from method II;
with a greatly ex-
panded dictionary, method III would probably become even more useful.
Streamlining the Pre-Indexing System
Much can be done to streamline the pre -indexing system for
either operational use or further experimentation.
Most of the pos-
sible improvements should be made in the organization of the dictionary files,
Presently, the files contain an excessive amount of data
that were collected in the automatic analysis of documents.
At the
time the data were collected, there was no way to determine just what
data would be useful,
From an examination of the accumulated data,
however, it appears that the breakdown of usage information by
subject-term weights can be discarded, and only an aggregated form
of usage data retained,
Such a condensation would reduce the size of
the files by 35 percent.
The use of sorting techniques in organizing
the dictionary would improve the efficiency of the pre -indexing
system.
Proper organization would allow the use of faster search
techniques in dictionary lookups.
Such organization of the dictio-
nary and implementation of improved search would be necessary for
an operational system, or even for extensive additional research,
a pioneer experimental system with a relatively small dictionary,
such refinements are not justified.
In
-34Additional streamlining can be performed on some of the programming.
Several operations that at first appeared necessary for
a preliminary experimental system are not clearly superfluous.
Applicability of Pre-Indexing to other Subject Areas
All of the documents used for this pre -indexing research fall in
the area of materials science and engineering.
Thus, the dictionary
contains the special vocabulary used by scientists and engineers of
that field.
A question arises as to whether the dictionary and pre-
indexing system developed in this research are useful in other subject
areas.
Unfortunately,
there is no document collection from another
field available for direct experimentation.
However, the nature of the pre -indexing system would probably
allow it to be effective in another technical field.
The pre-indexing
system does not work solely by choosing recognizable words from a
title and abstract;
In fact,
the system also eliminates many common words.
most word decisions made by the system are to eliminate a
word rather than to include it.
Many of the words very common to
materials science are function words and therefore are also very common in other fields.
such words,
The pre-indexing system is good at eliminating
since those words are the verbs and prepositions.
Hence,
the dictionary would still be at least partially effective in some other
field.
Suggestions for Further Research
One item of obvious interest is the effect of dictionary size on
the results of pre -indexing trials.
The dictionary used in these tests
contained approximately 2200 words.
affect the results in two ways.
A much larger dictionary -nay
First, the number of new words en-
countered in pre -indexing may be reduced.
Second, the distribution
of the usage rates of the known words may change,
so that the setting
of the usage-rate thresholds in the different methods may become
more important.
For example,
in a dictionary built from 80 docu-
ments, many of the words contained in the dictionary have appeared
in documents only once.
only either 1.0 or 0.0.
Hence, the word can have a usage rate of
But if the dictionary were to be based on the
-35analysis of 400 documents then perhaps several of the low-frequency
words would have a sufficient number of appearances,
usage rates could fall around 0,5.
so that their
An analysis of the data contained in
the dictionary as it grew to its present size indicates that any significant change in the distributions of the usage -rate data is far in the
future, if in the future at all,
A comparison of a dictionary based on
35 documents with a later dictionary based on 80 documents showed
only a very minor change in the distribution of usage rates.
In both
cases, the usage rates fell primarily at the two extremes.
The effect of reducing the number of new words is also hard to
predict.
Again it is very probable that a very much larger dictionary
is necessary to substantially affect the number of new words encountered in the title and abstract of a document,
Figure 7 shows
that the number of new words per new document is dropping slowly
after 80 documents.
In fact, a base of 400 documents might be neces-
sary to decrease the number of new words per document from over
20 at present to about 10.
A shortage of computer time prevented tests with an expanded
data base.
However, first a streamlining of the pre -indexing system
to a more operational form, followed by testing with a larger dictionary, would no doubt be worthwhile as a prelude to making such a preindexing system operational.
The Possibility of Using Word Stemming
The value of stemming in a pre -indexing system was not tested
during these experiments.
However, stemming mayhave two very
useful effects on a pre -indexing system.
One immediate effect is
to reduce the size of the dictionary that is stored as the data base.
Many different forms of a root with various endings are-presently
stored separately in the dictionary.
However, by stemming, all
words that have the same root can be lumped together in one dictionary entry.
For example, the entries for "magnetic" and "magne-
tism" can be merged.
Such a consolidation of data significantly re-
duces storage requirements.
Moreover, the stemming of words also decreases the number of
new words encountered in a new document.
With stemming, a new
-36 ending on a known root would be considered as merely another instance of the known root, rather than as a completely new and unknown word.
APPENDIX A
BREAKDOWN OF WORDS IN DICTIONARY BY
USAGE RATE AND FREQUENCY
(Based on words from 80 documents)
>1.00
368
6
-
3
-
-
1
0.90 - 099
-
-
-
-
-
-
-
0.80 - 0.89
-
-
0.70 - 0.79
1
-
Usage
0.60 -0.69
1
Rate
0.50 -0.59
3
0.40 - 0.49
-
-
-
-
-
-
-
-
-
37
-
-
-
-
0
to
9
10
to
19
30
to
39
40
to
49
50
to
59
60
to
69
70
to
79
80
to
89
90
to
99
100
to
1
-
-
1
2
-
-
0.30
-
0.39
-
0.20 -0.29
-
-
0.100.19
0.00 -0.09
20
to
29
Frequency of Appearance
(percent)
(a)
Title Words
>1.00
660
20
4
0.90099
-
-
1
-
-
0.80 - 0.89
4
2
-
-
-
0.70 - 0.79
9
3
-
-
-
Usage
0.60 - 0 69
22
4
1
1
Rate
0.50 - 0.59
52
3
2
-
0.40 -0.49
6
4
-
0.30 - 0.39
25
5
0.20 - 0.29
15
6
0.10--
0.19
0.00 - 0.09 1210
0
to
9
-
-
-
-
-
-
-
-
-
-
2
-
1
-
1
1
-
-
-
-
-
-
-
-
-_
10
-
-
-
-
-
-
-
-
1
-
-
-
-
-
-
-
-
-
19
9
5
-
3
1
-
1
-
-
10
to
19
20
to
29
30
to
39
40
to
49
50
to
59
60
to
69
70
to
79
80
to
89
90
to
99
100
to
Frequency of Appearance
(per cent)
(b)
-
Abstract Words
-37 -
APPENDIX B
RESULTS OF PRE-INDEXING TRIALS
C = Completeness in percent
in percent
R = Relevance
Method I
A
Usage rate
threshold:
0.25
Doc No.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
300
*302
304
305
364
383
401
402
415
437
606
619
620
621
622
623
625
719
721
72Z
723
724
818
905
907
908
909
911
1102
1106
Average
C
82
93
87
89
78
97
96
100
90
72
88
91
87
78
78
75
84
91
83
90
94
94
97
93
81
95
89
91
92
88
R
72
80
86
92
88
45
48
19
41
63
69
79
48
55
82
29
43
51
56
4i
52
71
65
73
43
67
45
29
65
75
88
55
-38 -
B
C
0.50
0.75
R
C
80 76
89 86
84 87
87 94
78 91
97 48
96 51
100 23
87 45
70 68
88 73
81
91
51
87
78 58
72 83
70 30
82 56
84 53
79 61
89 43
94 53
90 77
87 65
74
93
81 45
70
91
77 43
34
9]
84 68
88 75
C
74
73
85
82
75
94
96
100
87
76
86
87
87
64
71
70
77
80
76
59
94
81
81
65
68
78
73
91
80
79
R
84
88
89
100
96
58
59
33
57
73
87
86
58
57
92
36
52
61
69
37
64
79
78
74
49
77
53
44
80
90
61
79
69
86
-39Method II
B
C
25%
50%
75%
0.50
0.50
0.50
A
Frequency
thre shold:
Usage rate
threshold:
Doc No.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
300
302
304
305
364
383
401
402
415
437
606
619
620
621
622
623
625
719
721
722
723
724
818
905
907
908
909
911
1102
1106
Average
R
C
40 74
34 74
45 89
95
53
42 87
60 68
57 61
36 20
65 72
32 63
46 76
38 69
58 64
41 64
34 96
55 58
43
41
40 47
47 79
35 40
65 69
30 58
42 62
65 87
38 48
38 62
48 63
66 58
57 80
88
63
C
42
39
45
61
45
71
57
57
65
36
51
44
65
41
37
"0
43
46
55
38
74
41
44
65
38
38
50
66
59
67
R
72
74
86
96
85
71
58
27
54
59
73
70
51
63
93
58
42
49
80
3B
66
62
60
79
39
56
50
45
74
89
C
42
42
51
66
47
71
57
57
65
36
51
44
65
49
40
70
45
50
55
40
74
43
44
80
38
38
50
66
59
67
R
66
75
85
96
86
61
53
24
72
58
66
67
57
64
90
50
42
50
72
37
62
64
55
82
36
53
45
42
64
84
67
52
64
53
62
47
-40 -
Method III
A
C
75%
Frequency
threshold:
75%
75%
"ambiguous"
zone:
0.2
'to
0.5
0.3
to
0.5
Doc No.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
B
0.3
to
0.6
300
302
304
305
364
383
401
402
415
437
606
619
620
621
622
623
625
719
721
722
723
724
818
905
907
908
909
911
1102
o106
C
47
49
60
74
53
74
59
57
69
49
54
47
74
56
49
90
52
60
58
46
77
51
50
85
49
43
66
71
67
71
R
57
70
79
97
77
54
41
17
43
57
60
58
50
58
85
43
36
40
62
31
60
55
50
72
35
47
45
32
59
74
C
46
48
60
74
53
74
59
57
69
49
54
47
7153
47
90
52
60
58
46
77
48
49
85
47
43
66
71
67
71
R
58
71
80
97
77
55
43
17
45'
57
60
69
49
57
86
44
36
41
64
33
59
63
50
74
36
48
47
33
60
74
C
46
48
60
74
53
74
59
57
69
49
51
47
71
48
46
83
48
60
58
41
74
43
47
85
47
43
66
71
67
71
R
60
73
80
97
77
58
44
17
45
57
64
53
49
56
89
27
36
42
64
32
59
61
52
74
39
51
48
37
61
74
Average
60
55
60
56
59
57
APPENDIX C
PROGRAM STRUCTURE AND LISTINGS
The programs for both Phase I and Phase II of the pre -indexing
system make extensive use of pointers in order to make data available to all subroutines.
The programs are written in the AED lan-
guage, which has convenient facilities for handling pointers,
The programs are listed and further described by phase.
The
Phase I programs are used to analyze document subject indexes and
generate the dictionary files.
Phase II actually generates pre-
indexes by the three different methods.
All programs are designed to run on the Compatible TimeSharing Computer System at M.I. T. (CTSS).
PHASE I
The main program, MAIN1,
sets up all the data handling arrays.
MAIN1 makes room for a list of title and abstract words, a list of
subject index words, and an array for storing appearance and usage
data.
Pointers to the various arrays are stored in a directory array
PTRS.
Thus most major subroutine calls need only the argument PT,
which is a pointer to the array PTRS.
which is useful to the subroutines.
PTRS also contains other data
A list of the important entries in
PTRS follows:
0
-
Location of TA, the array for title and abstract words,
1
-
Location of SI, the array for subject index terms.
2
-
Location of D, the array for appearance and usage data.
3
-
TL, the length of the title section of TA.
entered by a subroutine.
4
-
AL, the length of the abstract section of TA.
5
-
A scratch location sometimes used to pass arguments
in subroutine calls.
6
-
The number of the current document being operated on.
8
-
Another scratch location.
9
-
The word length in a dictionary lookup.
10
-
The file number in a dictionary lookup.
-41 -
This is
-42 11
-
Another scratch location.
12
-
Location of BOOK FILE.
13
-
Location of dictionary file in use.
Three major arrays are used in the analysis of a document
record.
These arrays are TA, SI,
the title and abstract.
and D.
TA contains the words of
SI contains the subject terms.
D is filled with
the usage and appearance data for the words in TA.
The TA Array
The words in the TA array are stored four characters per computer word, left justified and blank-padded.
Thus a six-character
word will fill two computer words and have two blank characters in
the final two character positions.
The words stored in TA are
separated by a computer word of four blank ASCII characters.
The length of a stored word is defined as the number of computer
words it fills, including the fence of blanks which separate it from
the next word.
Thus, a four character word such as 'film" has a
length of 2--one computer word for the characters plus one computer
word for thie fence of blanks.
The words "magnetic" and field have
the same length of 3.
The length of an array of words is defined as the sum of the
lengths of all individual words in the array.
The length of an array
does not specify the number of words in an array.
The SI Array
The first computer word of the SI array is a header which tells
the number of subject terms in SI.
subject terms.
The header is followed by the
Each subject term also has a one computer word
header which contains the length of the subject term in the decrement
and the weight of the subject term in the address.
The D Array
The D array is used for recording the usage and appearance data
for the words in the TA array.
Three computer words in D are re-
quired for each language word in TA.
For the nth word in TA the
-43 contents of D are as follows (the position of data within a word is
indicated by the octal mask):
D(3N)
- Weight 0 usage from title, 7C29
- Weight 1 usage from title,
1777C
- Weight 2 usage from title, 777CZO
- Weight 4 usage from title,
D(3N+1)
1777C10
- Weight 0 usage from abstract, 7C29
- Weight 1 usage from abstract 1777C
- Weight 2 usage from abstract, 777C20
- Weight 4 usage from abstract,
D(3N+2)
1777C10
- Weight 3 usage from title, 777C27
- Weight 3 usage from abstract, 777C9
- Title appearance,
777C18
- Abstract appearance,
777C
Data are placed in the D array in this particular manner so that
it is formatted for direct addition to the word data already contained
in the dictionary files.
Dictionary Files
The dictionary files contain all words encountered in titles and
abstracts, as well as the cumulative usage andappearance data for
each word. The files are sorted only by word length. Each file is
two tracks (864 computer words) long.
files for each possible word length.
such as..... M03 ...
N04.
Thus, there is a series of
A file is specified by two names,
The first name indicates the length of words
contained in the file, in this case 3,
The second name indicates the
number of the file in the series of files for the given word length. In
this case the file is the fourth of the dictionary files having a word
length of three.
For storage, each word in a file is followed directly by three
computer words containing data about appearances and usage. The
data words have exactly the format described for data storage array D.
-44The Book File
With the dictionary files as described above,
a bookkeeping
system is necessary to keep track of the number of files in each word
length series, and to monitor the number of words stored in the last
number file of each series.
mation.
The Book File maintains this vital infor-
If a file word length is M, then the number of files in the
series is stored in location 2M and the number of words contained in
the last file of the series is contained in location (2M+l) of the Book
File.
Operating Phase I
To operate Phase I, all Phase I programs must be loaded and
started from a CTSS console.
The files CATDIR FILE and CATREC
FILE, which contain the document records, must be available.
program will type the number of documents analyzed to date,
request the number of a new document to be analyzed.
The
then
The user re-
sponds by typing a four digit document number for the new document
to be analyzed.
The computer will signify when the analysis is com-
plete and wait :or another document number.
The Phase I programs
will not analyze a document which is missing a title, abstract,
or
subject index field.
Phase I automatically collects all data and builds the dictionary
files.
-45·.-
4
L
~
·
Li
x
IU.~~~~~~~~~~~~~~~~~~
.1--
u0~~~~~~~~~~~~~~~~~~~~~
ILL
*4.U.
~
e,
* W
a
%ft
,~~~~~~~~~~~~~~~~~~~~~~W
U. or
*
In
-
2
WZ
U
~~~~~~~ux~~~~~~~~~~
·.-.
I I~
X
s~0
/
U.*4
W.
sa
ca
U.4
W0
IA *0
)--.
*U,/llW0
wOZ
Z
I• S
eft
a
*
~~~
W.
O
0 * 4W
.t.J
' -'.*
LiIm 'S~~n
Z LLN
WJ
o
0
4~1*0i' ..J
W Wuft
n).ZZ34OUZ
I-wl-0
L
C
oU.eaOt
tL
40
wZ
>2
CO
Zw
0
00.
ao
0
Iw
-
*
I-Ce
a.
3-
*.I~me o
W
-JUUaQ
)llr,
C
C
Z-
i
ft
In
In
f--W-I-
I
Xt
IL.U.L
-L-0
a
U,
0-L
UJ4 U
* a.-
~
Li1-
Z
VI
I
:J
-a.L)
...
4~
.k
ma\Y
· a
SW~
*
*2In-*r
1 -
-
0
U
?Un
-
*O
~.
O
0
W
I
L~O
*
W O~u
4W IU
OW.54
-i4.*£JIJ
o
ft
2
U.WCIU
Li~~n4
*)
I
LO
.~
4~.
Li.i
a
J.
*1-1
ftft£
0
-* W
*
*QU\~~Iu.C
*a.
a
.
.a
-eC
~9~~~~0·U
a'"CaO
I-I1Inr
O
t
-O-i
--
L
-c
ac.
LI
IS
aOJ
0ccr
Li
a'
a~cr
aL
0u
Li
LI
a
ur
0
LU1.
L
..-`O""'i
*ZLi0
.4
·ZC
II
£
i
*In
LICU --QL
I .4Cl
-4 Z
Z
I
U
L
lM-J"*
--
"L
I .4
4
.~Ui~I
I.
0
C
In
W
'
·VI CL
i-
X
V)
· r~~~0
a.Tr
a.
-C
1-r
t
a
0
LIr.~~~
051-~~~~~
I'
,
.
n
4
n
In
C)
4
_JtX0
~. --
a-
InC
L-I ~.
Li~
ZUL
.aaILf1
~a.OL
~LIIUt
a.
LZIU
Q ft~r a
L
a
a· .
4~ .04
4cu
ON1(
.4~~ .1-·0
u i
LIL.Z
U.U~ LnY
0
0
' a.n4
co
J U
ft
42O
0
4iU
S) iO.1a.ILL~r,
02
32D
.A
S
a.ZCa.
0.1-U
U
~
·
u
0
4
1*1
*U.-
,
£aIJ
OW
U.L4
I0
v
1UNic.aW
IL
0*(0Vl-ftL~.I-1
0(
1 ft
£
Wc.Li3
X.
V,
w.aaaa4~-~..f-LL'J
raa-S.
*.O
*0c-L
* 3-
-
I
luYI
-*InX
*~~
0
O.C£0I
044Li~~C
1-Z
LILIWWI-"I-
InW
0
4
Z
In
<.L
Co~ ~~~ ~ ~ ~ ~
~~0
~ *LiLi4~ *44. 44
f.
~ -0 Li4L4
~
00I
ftft*
ft~
Ont.0In
40
. .
n Z -U
.
~~4
3
2 ~ . 4W2Li3ia1*0 ~ ftf
~ .11
_
~~ ~ fttJ.fttiL
* ~
W ac
0U
.
N~~~~~~~~J
4
~~~~~0a.I*
01Oftr
.4.03a0Il
.-.ft-ft
-0.-.-4f~
u
U~tftttt1W I
W
l.
~
U.Li
WI-LL
*~
-'
44
,,*LW
W 1-t *
W
J*-.J
.2
*
:o
U
U
Z0
a0.,
a
ft
a,--'
4. - SL
-'IL--I
.' *
-XI"
UU.
.
,
I~
I-
A
W
·_
-
0
41~~~~~~~~~~
·
*
N
*
z
U
-
-J
XUJU
0J C IXW
*0 o
*4/Y
*U
·
Z-
leW
* -4W2
,
l
w
Z
oz'
*40 .0~~~~~~~~~~~~~~~~~~
U, J.
*4
~.
I)(
LC
cI
SA
LU
.1
aI
at >. 0
-4
4
L "0 .
z
Z
ftJ.
L).)0
·
c
0-W
c~
*X
1(U.0
-Wi--3
uL 'n
Cft
04W
c ZWWWft:
ra.'-X
Ut
Xa
o~~~~~~~~o
Li
;W
3X
---C
-
U1
-
.0~~~~~~~~~~U
1-1.11.12
0
1(0
n.LLI
. U
4
a,.
.
-0Ix
a0
0ft2
2
X
f£a1-J
r
a
.e 4
0(YIC
x.II
1-C.W
-
Iv
I a
C
£.4o
*W
4
*.0114~
I" a.i1·-O
.004
~----
.14
InwWZ
I
) I aW
In
ofZ
L
WU.
X
2W
I
fo~~~O
2
iL
1
I.u
ZC
~U
O .LIJ
:
1-IO
*Z
*_
-
·C
LiL
C
L,
U,
*0
--
tt)u
U
J
ZWC
W
LU
CZ
0 UL.
0
JL
-l
-40
O~b-
Z
*
Z
Z
00
aX~O
*0.
r0e
a.
ZI
ft
0
0
ft
1-1
0
-~Z
-e
*W4·u~Z4Ov
uw
a.
0,,.4
-w 344
41-'.4
-
V4
W I~P~~
X
cc
4
a. ~~
~~~~~~
In
X ,
IJ W'·U
C
"
WC
+"
4
0
-
1 (L
UL.O.J'
atIrZ
LiL
OZa.
.4.43-IO
).
f
4 a
5 UUIZ
1
IIZA
aUbWf
OP*3 u.L)*Liftflu.
WWOUi.-OCa.OOX
ft-lft
"cOw~r
u.
a..I'OC)
za.LiUiOi4W0.laza
.L.
Ii,.
,-Zft
01-4)-
*0 ft
a
*4
*/
.1W0
InZ
L Ud -.
*0
~~ Wf~C~rt>I-.1I-.
0 z",0
-11
IU LO
O
Zc
Z
L.1ftZ
Int
~~~~~~a
'
'.
a.~
L
0
IW
Lt Z1.
fC
WI
W
,0 luu~u
C411
>c-1-01-~
zft
*' n
W£a~f
cO
*
au
%
£'
W0 Ia...JOuiLI
4
0.
.0 .ft
OW-0
0 Z. ft-0 . 00.WI.I4W
O
3
Li
0
L
*
so
w
Or
ft4
sL~O~
ga
~~
ULU.
3
0
U,
.
i,
eJ
C
Ix
~~~~ 01-
C
0L
--
*
1
-U.Z
I
Uft0
Z
1
Zn
uL
3,
i.:
Z04.-.
W
4240
-I
3
0 Z a
cnom
cO
1-1-.JW
-U
Oft
W L IW
*4
t
W-· ·.
.0>3_U
Li
0
2~
-w
I
*
f
0
a.
I-.I
*~~~
w3 -a
*
W
I
*3
*
U.
0
0eI)f
Zg
Li
W.j
-u.C
C0e
*a.
U, 41W
0ftZ
WL.)f*
U,
U.I
Z~~~~~~~~~~~~~~~~~.
U
*
W.
5.1
- U.W
Li
Z
ulO
er
-
1o
...or.
*
*
a
w
,rr
:0
zooOw
0
-CU
-
C
IVI
In
0UI
.
I~
.i
W0o
.
0
31- U
J
ZtW
.
* ofl
-WI
*
0
o~
r"
r
LiV)Z
.
L
0
cXN0
U Z-W
,
LZi
-46
-
*
'g
*
44W
CU.
0C
Ud
-1
NO*
U
~
'
,QO
a
~
4r~~~~~~~~~n~~
-Q
Ww
-
*
*
a*
:: U,*L,
0.U
VI'A
W...
5.40.U.
43 U)
U,40
4-W
.(
D-1 C
4.4WU
00.-43
C,
~
W
-C
4- 1
0I
U
S-Q
3.1)
434. 33
4~
.30.4
-cuO
*~
*LL
700.
04Z
004-
*
Sa4
ur
~
~
U..5C)U
4 3.
44311w
L
0.0ZZ
cc.1-
.3
-7
W 43.u
3
0.
ZZ
*4dN*
4
4-4
0lwcc~rl~J 4
~coZ0
UO0.
Q I"'ll
0
-9
0
3..
4
-
--
J
~
I
4JI
0.
01.4
W0
t UI
W
~
*
0.)
W~
UN
YU
LI4
-
-
0
I
C
a,
I
w-
0
U, -J -.
2
Z
004440N0.4-QJ~U
'O Z
0Z O
.
44
NN00.L
.JUII4.J-.4
t 4*44
*4.
4U)U.L43
4-4
.70.0
0.0.0.0
U.
U.L
r-t
4-
-
3
0.
tW
U.
0.
0
W UN
4-
0.LI
Ut
U
*.
U
.
0.
Z
4-QU
MU.
to
Z
U
U-
.
Z.
0.0.0
a
40.1
0
u~~0
.
4-0
00U.
0-
-L0
0.
"J
W
4
X
~
)W
*Zt-
W
0U 40.4
'rc--rr·r-4r-*4·-~4
N
*4C:
*WNO
4414…I-u 0
%
LI
0
0
0
3
04-
40044
3
.4rN
.
4
4-
0.
04-
UN4.t.
Q
a.ULC0
0Z
0>
ID
WCJ
40.1
-0
00 -J I
0.
· ZO)W
LL
00.M)
Z
ZWU.
I
4/143
W4OU
UN OZU.0
4 M
UX0A
ZU Z
.
...
z0
W
ZU N'.-a
LJ
-t
00
43
0.W4ZX..a
P
ZU
Z
4I
X~
0.4.)
5
a WN.
0WV
0. . M
U04
V4
I
00Q
33..
N
Z0
*
0u.
0
YNN
4104.U
L~I
a..1i-
*Z
a.-
a.0.;
0.D1U
4-41
0.
04
4
W.0
-
Z~~~~~~
0.UWW
ri~~0
04
.
L)
0.
*Z>
30
ZZ
0LI40
40.0
ZM
4- . N
4-*Z X 4-..L0 ·4C
X
. .
*A
0..
U
4
4)( z0
LI
rU.1
U. -wz3
0
I
.Z.0.
4- -J WN
14W N
430
O
411
U
0 .
N.
0
4N4..4.
ZbO
-J-DN-U,
W
e
U,
0
0
D W.m~
-O
3
40L
Ze
W
*
W
D
0, O
U
Z. U)
4-·
*
X
U
L
a~~~~vrCC
,
u
4
4
0I
0
4K w 0.- 0
U4''J
L
co
PO VI
~ 40z
aU.444J.
YV)L
a
04-4-~L
J.-L
co
U,(X I
X
K-W40-
-
1.
.
3t
X
I'
'
.4--4
0.
OW
0.
U'
U~~~~~~aaL
4-4I
-
I
4-
2
W WU
W
.Z4
(U.4 0at
L.
04-C
of4U.
WKI
.
'4-
40
Z Wa
0 .4-0 a
W
4~~
23
N
X
U,",~U
0
4N.J 3
0
3)4~~~~~C,
-JIZ II WU
4P
m2O
3U..
0a
.
XL)
4r(30
0.0
I44
.4
W
-r~
.4~U
0. +3 K4
K~
*
0
43.L0W
0...J7ZIQ
3)4-)(W
000...0
4
0
0II
C
J
X
or.O
U.I-,U
4
UX
D
IJO
07D
4LU
owZ
43 C~r LP
44-70I3IL
~~~
04
0
4-
0
44
.
-J X
Z
U, W
31 4-
A
U.
0t
7
0.
4U,
4
4a4
0Z
U.
4Z 0~~~~~~-'
-- a.
0z
0 U U.
3·I~
.44
0.rO
0.WJ
W I
0.03
Z9
'A
43
W
Z
0
4
0
U,
ru
a,'4.
0r
,
0
.4
r
~~~~~~~~~~~~~~~~~~~~~~
Z
XC· Z IW
Z
0
AU.D U
L.
W
3
0
4-
U.-.
~
-
.WU
0,
*
0
,C.I--4-C
0
0
O 4-14-
'JL4
aO
ma.
Z
04WCt
Z~~x U
Ut43C
04-
C,
--
)47ccL.4
74-I'
C,
~
4-Wv1XI 4
4
wX.
4
.
X
CD
U,
I-0
a
0~t
Z
U
U
U
0
40e
Z
w
~~~~~OT~~J
~~~~~~~~~
u~~
3U,
M4
.eU,
0
UaC
-
N
Z4-~
M
00X
·
0
.
0
03
4--40~~~~~~~~~C
ccZO
00. .
-14W
be
5K
4
N
.
U
U
,
X
0.U
07
IL)
ZC
Zc-
*
No~
1
4..444W
l
0.3.LJ
0.0. Q44U
WW
0.0.
).LmcoL
4444A4 *0
~~~~~~~~10~~0Y
Z
O.
0
0
0
Z- "
003W> U,
0J
00 a
W
OWN-I
0.~IC
Z
LZ
KLI4
4i-L
-4414 +*
.... C~*~
WW-K4-0.4-·r
a
W 3J c
-
43
.I
*
-~~~~~~~
N'I'U4UJ.c
004.4
0.
4-4-0.4r40.
0l·-. W4-43IZ.4
3
0.0.0.0
WOO
g
.
Z
0.
3----
0?
ZOOuJ
4-O
*
4-0.0
0.0.LI4*-
(.4444
*4J
S 0.4.44z~r
* 0.UU
lc
X4
O*.
W
43U4U
3aW
W W0 44.
Z
O...0.
00.~
*0.
W)ZC
UK UO
Ua
0.orWW
>*
0rX
.-
0
0.
-.4n
L041
44
4i4W41
M I
U.
a
I0
O
0X0
0
i
!:.a
W
Z
%
43W
0
Z-4-N-U
4-
W*
D C
OWn
07
03..
'S1
Q9 .
54'Ld-04
*a
*
Z
%
LYZ
133
4- U,
LU
ZZU.0.-
AL0
X
*U.cc*
M
0
ZI0
O
c0
CW
Si
Z
cc
C
-aq~44
-zV)
.4M
0
KW
44
w07
.0.
-47-
x~~~~~
4
0.W~~~~InYL
W
Q 4- W
U
4
b
4
U)
0.
0
IL,
U.
o
-
0
-
z
0.
4
3
3-
4
co
z
4
x
u
41*4-
0.
0.
4
a
£4~~o
2.
-I
Ovn0.In3
U
.J
-
4
x
.
4
-
U4f
U
0
4-2
IU
U
u
4IV
4 .32
U
4
0.*.J
44*0
In
*
2.
w3-r
2.40.
InZ2
*-EWY
0
In
In
4,~
.2-4'--
4
Z~
4
W44
U
0v
.45'U0.4
..*"*
~
..
JW 4
~
~ ~4(0
4-.
Lk
0
3-
UU
£4
z
-
04
*
4
UY
2
0
0.0.OOU.40-400-Z
0.-2.
uD
k
40
0
L
4-
-
4W In lo Z
z
9
O
4U
0..
InU·~
0.0
W'
.32 .z
o
0.
4V'o
4
*L
cx
z-
mc%.
*2N
0
2In~Y
>lLUU~.
w2.ru
4-
4-4
.
WI)49
4
U,
*U.4
0..02.
~r
OWc 0I 44
0.~Ic
0...)0U4
*-..
WWL4U33r 0.-4.T.2
·
4-
..L
4
4
Q
2.
.
W
.
0..03004
04343W4.~
-42.
4
14
.
4-
-
4-4-4-04-4/43Z~
4.3
LLZ0.L
4-U4-Z
4
4LI
.30.J~c
Ucn
U 4d443U*,
--0.WD340
4Q.c
034
4
4
W00I0.ZnU
U4
4..4..J .L4-4-4U.4U4-4WU.Z.3a-
4
444
UwL
Z. nU2.4Uvn
.
44o
z44ZW)U442
z
cc
4I
T
014Z
40n
4*
I
I 2.0- 1
W
).-....v
~
0
WUU)
14 I
.4
UII)
£ 4W.I~OZ
cum
0.4Uo0
InWZW0
O4v
*
ZLO
3
u
-ZZ~W
ru
uO
4-.2
4-*4
L~3
4W
4--0.U
O0WW2.r\-3
0004Z MmI
0
.ZZOr d
..In
44
4 a
w
Z3L,
0-?
OW.
)4.
.)
U.
ZU
0
0.
U
U
U
4:
Z
-
90~~~-Q
U
0
0
0.U~ (3~
OU
2W-*
4-W
U.~~~~~~I
a.
*4 ZL LVI
c
0.1
.3
0.
414 14...
000
-42...J002
9
U
4PW
-"
~~~~~~~W0
U IL440.~~~~~~~~~4
..J*0
4-UUU-4--Z
2.2.
24-WO)4
Ob~~~~~~~~~~~~w
4.
3
J U- I0.
3.4.*C3
44
40~~~U
4a
0
-+
-
U.QTc
4-U
4Z
r-In
-
ZOZIZ
44-U
~
.30
44'
02.In0
u
2r
c
0C
I
0.
.0 40
4
-·Lt
2.U
L
24-.0.
-I
4
..J*
LULILC
I
owclauO
1W(
·
-2
N
I
0
Utc
C
~
Q
44
V)4
O
OwOr
O~~~~~~~~~~~~~~~~zIL
I
WO..J
4--24 QL~
4'-0.I*U
2,U-3V
.
*IW
44-2
-O
'
W0.0.4
4-~nUn.4U.44n
C
U·
4
-
-X
0
4Z4.4
4
.4
In
-2.0
4
0.2.0.in-.22
4
-N
*-44-.
U
---4..
.3
0.0 4.
£+
Q**
.WaIn*4
C~YCQ
4
W3WU.2.
4..
t~
4-U
a4
Uw
0.1.X
0
W4
44a
.1.-
U-3I4W
4urr4'.O*4-F
La-·lcn
In
*
2
L
UWLUn
2.4
4-
*
**
W
4.30
4U.
.4-*
4~
p
44
.3
0
~WC
Lm
x0.
2.-W£
4-.4
-1
~~0.
40.
U
P
In
20
U
4-.
*4
4
4-Z0Z
0. w~r~
xrrZ~~~~cZ
24U
0u
4U.0.4~l~~
*U
aU
U 420 u
-.
1
0.J
4.
0.
U
*.JW
.34C3
4*
£ 4WUOO
*4U2.
0In
0Z
4-40
0
0.2.0WU44--
~~~~~~~~Zw~~~
U2.
22
-4041W~
r~~
0- 30 0
0.0.
0.U
,
01
£43C
Q-OQ
O
I
*~r .
."n
W0.I
~
- W) r -
-T£
o3
00
4-4
~r
44402
0.
2
.J0
0.4-
-u.2.I-44
UY
)
-.4-4W2
WI.In
2
*.
-
0.
l 'aU.
44.
-O
.2-.
0.Z
Qn
U,
W.
W
Y4 Q
I z
2cc
4
-2.0.
0Icn
u,
4
wun
WIL.-U
Z4-U.
U0
-w
~ ~ ~ ~~~~~
~ ~
arU
W "I4
~~ ~
rr r
0.
3 4
2
o
3404-.
wu
w
413
Z)*u
£44-
-
~~~~~~~~~U,
44.1*
Iw
41
2.I w-2
cc Z4
Lu
a.
OZWuOU.2.00W
0Wvc430--0U£
>.
WIn
4
4
3
wo
x4W 4-QU
0WW2
4-7IW
2
QUU-L
-"1
**
44
o.
4
N
U.
u~~~~~m
4--4
0.OW0.
0.4
2.
LU
43
2-·U
0
rw
ZU2.O
4wt-W4-
U
4-
4
I
*
4-
03
·
*
k444
.3rr
*4
.4-44.4
U.
W0.
Z
In .
Z
4
4W
wD
0.W4f
Wu
3C10XQ
0.
0W0.
.34
4-
4-44
.3~42.44-4
InU
w
2.
44
-4
ow
0.4or
z
..
0.
U
U
.
0(3
-.
-x
-
-
Of
at, 4W
I-
34
4
Z
1
U.
U
0
*
I
W.--.4U.4-
*
4
0
2.
4- *j4444
G
4-
YUJLU
*4-~~W
z>-. DU 04W4
O£4U.
-r
2.0
UI
4-0. *In2.I
44Cc.3
*g~
2.
4N-4
01--U
~Q*4
~ ~~~44 ~34.3
P
ZW-o
U
0
2
0.
In
4.U£4.
w- 0
4-44-
~-z
U
rxwr-.I-4
Z44-4444--0.0.
.14-0.
4
£40.44I 44-
4.04N4--4.12..0.
LUr
£4
UY
-O or- 2
01
U1 -
O~rw
3.12
2.-
0
4
woz
0.
o 0.U
WoZ
2.
2 0 LL~ 0
.·*Z
43
4.
0I
0
Z
4- *~w4
+
£4
2
4
4I4C
-~~~~~~m
* 1 0e.
3*
*4-
4I
.3
c
0
40
4
44-
-ZIA
0W
W)
WU(W-t U
.
4 *U`
44
..
£4
-
0
-
.
2
4
0.
4In
U
x
7
9 LW
*Www0)
Uz
4
..
o
Is
4
Za
Z
ur
0-.0.
.d
J
4
41*
4J
-l
4-
cc
~~~~~~~~~~~
0
~ 2
4
0
O
0
U,
4
3
2
o
-
U.e0. of
4
2.
14L
24-0.
U.
U
WY
2.
0.
-
30L2.
W.0
3.2.
0...U.
.
U.24
23
0.34
U
w
-4
U
W-*230U
2.
IUWUt
0
4
4
P
InW
U.
U
4444
-48 -
X
-
A0-
0
W
-
J~~~~~
0
4
06~
CW~~~~~~~~~~~~~
2
2.
u06
0*
*
0-
3
U
UU
u
0-
0
W
U
00
4
W
0.Q
0
Uu*I
tUQ
U
W
U
0
W
c
0
4
W
W
W
-
V0.U.
M
-
U0.0.0.0.0.0-~
M
MM
Z
XYYL
,
mmmma
0-----·
3333300
06006006
U~~U
0-0-0fl
06.
NN~n06000
X
0.
wr*S Z ZZ
01W
Z
CD!
"I
~~
~~~.0-I-.
U,4*
W~LYYILUYIW
-5
*
0
W
.1.1
3
3
-
0--00(
U0.J
00002.0.0.---J~X
3
06
-
33U
W
32
~
~
M.
.-0.
",
-z Z
W
.33 UQ
ZOA.
*-W
-
0.
9L-U
W
Z
06U,
-.
I~0I IU
..J4*4*4*....
*.J0
UZ
1U04
W
0
U
0
U
0-
0.
0.
0-L.-
W.
W0
4*
oo0-00.O.
06
I'll,,
.. I
U
U,
32 :
0
I
M
L
0t
AI00-0U
Ua0ILZ
..
I-U,0.0-030
.
0
W
~
0-
-X
0.
A
0
0.V) .
LU
U
.1
U.
4z
U
J W
0Z4
6
U,
4* U,0
060
C,
4*0..
14*U
0
000
LUZ0-0
Z
UZ
.1U,
4
0.~~060
UJ
3
X
ZUJOXXWW
0~~~~~~~~~~~~~~~
3
-~~~~~~~ao
Uur~~~~~~~C)Z
OX
U,0-LU
U
L0W0 U)
UaU0
U,1.
-J-
f40606
U
ZW.1
7
XU.
2
'J
32
X
W4
U-
Ua
X
4
20
,
06
6
06
*
U.
~CZ ~
~
.
U
~ ~
D U.
XU
O.-t4
U.
V
4*0
Z06
2:
U.
000
WW0.~
UM
ZX M
~~J.
0~-.J
~~0
n
U0-
0
L
.
CL
'JU
cc
~~~
6
SO
U
0W
0. 06'U, 0
Z.ZW
-J 0.
4
W
0-
0LU
006
0-
X
060-
06
0 .
W
0-0£
C,
6-Z
404* -
0X
X
Z00
*
Z1 Z60
Z0 0.. LU
at*.-0d
----
6*,UQ
60
0
.U
U,
W0I
0
4.
0
UI
LU"IZ
0
U,
"I
0X 40
W.0UWU,
22.
cc
0.2 0
*U
Ko J0 Z
30-
0-t
2.~~~~~u
W
0.
a
X
U
o
06
*
6
0;
-J
2W0Z0
W0
-Z0.M
-
O
X--J"0-"
0
-UZ
M ,
L
W
-X
1
0
0
.
0
6
0
66
-
L4X
66
-
0
M
4*
W
U.
3,0
-
LZ
000
X
0-
X61.
X.0W
-~~~~
02~~~~~~(
W~~~~~v
2.
a.
3.
0
*U
W
..
0
0
*60
0
0
00Xaa-. ;
Z0.66~
a,*-0-~60,0~
4
Z
*
ZUU,
I0
Z
J
-J
U
U
06.00Z-I.,UJ
02
.
006
4
-.
WC
W)U
330
0
-
CD
~~~~
U
ZW0-W
--04*
0.06 2XUL
0. 0up
3 4*3
OW
0t'
.
4,U
~
0
U,
v
U
U,
.0-.
XOAJ..U
'*.
U
tC~l
0. X
X0
W0
X00-
-.- U
W
W
WLUO.W
.1
Z-
00a.
Z06-06
0
J I
0
6
W0
WJ
A-
~
0
0-0
W
Z06%X
W
-0.2
X
0.0
-..04
U,
W~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
X
U
WX
X£
W6((L
JLJ
J
U.U.
~ ~~ ~
~~~~~
0
Z~.00AU..L
60
W--
~ ~cZ~ ~ ~~ ~LW
X
It3
ZU,
0.
0
UU
4 *Z
06
0.·
U
4UZ *
~~~~~
-
*
.0
2U,
*
t0.
Iu~~~~~~~~~I
UJ~~~~~
CDZZw
XZC.
WOO
0-4W
X
000
W££0.£
4*£WUU,06-Z
0.
00
Z
W.1.1~
X j0UOaU,4
0.
4*4*00-U
4*
W60X0
I
*~~~~.UZOL
3 4*2W322
'
MU,
ICICW
04* I.. U,
UU2.
3.062.
0-.
.1 0-. .00
XJ
2.02. OU.L - 'X0. .1
£ 1.03*00.
&
W..J
C,.0-)
M06Z.W
06(20:
UUU
20600.12W
LU
33UwW 0406..2Z
U 42
XU UZ
X.
ZJ
0~~~~~~~C
X "I
W..J
0/06
Q60
0-
X0-
0-
. . .
0.Z0.4*4*-o
33301.6(U
U,
.W~
-
WOXW
et
U0-U,
0U
0-
0
~~~
~~~~~
0~~~~~~~~W0
~~~~~~~~
J
0.
U.
0
U,
V,
3 2XU
I- X-0.4.
.a£U
0.0.06-
-
W
4~
06Ut06040'ULX*J0J
U
W
J.1
W
cc
Z
0.Z4*
W
~~~~~~~Z0-W0-0.w
~~~~~~~~~~~~~~~~~06UVI-X
4*
U
0
0
2
. -
0
w
J U,
0U
U0.o
0
U
c~~~~~~~~~~~~~~~~u
~
U
.L
ON010
10
ZW
WJU
M M WOUU..
0-0-0-U
W....
".,
"IL
W0-00--06..06
2
C
%.~~~~~~~~~~
'J -J
2.~~~~~~~LU
C3 )<
0.~
C,
**4
w
0
U 0.
0.0.
I,
0
.1
4*0-W0.0
UU
.
-
-
06
4
U,
U,
061
0U,
U
r~*W.I.
W c
X.LL
.1LXU.00.
uZWM
/06-~a
.O X0-·U,
U
~
U
U)
W0.
W
L
3
0.-4ZM
4
Z0W
0
0
Q
2
-
w
0,00
A
LLIL~~~~~~~~~~~~~~
rL
J.
4*
20-20.
U,I
0
.'
00
U
i
U0406
£0
"I
cc4
W U.
..
+W
Z
49
£
*Z)1
L
00.
0
W00.
14U.U0
M
0
*.
-J
0
M
4
WU
0
4
M
CL
.1 3606N Z-' I.
J..
I 10.
0N
Z W0.M
WJJ
J0£,06
06 0606.O.0U30--0-0I06.*-..J0C
3
P
I.-
W
uj
I
U
20
*
0
~~~~~
~Lc~t~-t
~
4***
-
W
0 06 30
2
*W _*
0
U,
6
ZZUa
0
0.
4*o r~~r,
00-4W
0
I~m~~·
0 a -N0V~~n66.
-,..J.
*.W
6
W
WW
0-
0aU,
X-340
4
00AWZ
it
UWZL
2U
.1
Z*I
£0
O
I
WXO
"W
Ucc
ID4
4
Z
0
UW
XM
w0WU
0toat0
W00-~
UM
M
Z'
Z
v n ..
I
02
- W
Z
-
WWU
>10
~el0.
oU.
X4
2-06
-IL
IL
*
'U
Ur
WLI
0C
0
UU
02.Ww
,
0-a0s
OX
.-
U
0-W.
I 2*
WI-
W
U
U
C
W
U.--W Uw
"
I"I
UW
O------0-.1
.
*++++.
U
0-0-0-0-0ILa0 IL0.Ow
-at
3
W
0 J
NJ666666U,
W
I
%Z
*~+,,O
U.
U
Ua.-
~
·Wtf
3
X
0
W
IZ
3
Z
X0
0
Q&U
I
.
.6
W60
X0,...03.,
U
L
0
U,0
..
-49W
*
·m
:°
*9*·
~~~~~~~~~~~~~~~~~~~~~~~L-K0
4
S
*
*1
W
W·0
4
9-4m
U~v
3.0
SW
C)
.9L-
4
.
.
-wc
IL30
63.
*J
*
5.
*
)
W3.4
u
U,
-03--i
0.0
.
-J
60
Wm
4U'A
.£
*
O
*-.9
*.9.
* U9
*
uv, 9
*
Z
9.
1A
3
9a. Q
4
W
-I
9
I uI
U
4
9
IL..J
4
*
*
~
4-
9
CU
0
W
-r.9-'*2*
4949J
4-N4W4·
49-999
0999
a
v
.4
- c
v I
u
£9~u 00)-'M
6-0.9-4309-093.9-
~
.W
U
9*0r~Q~uQL-0
U
33.
990
9-094
UWOW
3...9JW
W
4L
U -
a
'9
'L
U
6
00
v.90
U
40
00
0
40
U
O.
UUU%4'
4u
0
U
U
.-I
U
9
0
U
0000
U1'A.
00
U-
4444U4U90.4.40*
NNNO9.9
000C009
9-7 OW-9X
933J33S0W3333
I
W99
09-
0.~O .99.)-4
90.
0.00
09--9
3900
~~~~~·N~n~~~uI
93.
9-
0
-
7
W)
U9WWWW
*
W99.9WW
73.33.3..93.3.33.3
.-.
9--9-9-*09-9~-9cl·-uZU~~~in-
Q
w U
32
4
U
W
O r~d990-90J994 .1wC
4000000
'00000
0.
.. Ja
9.
. 4 .0
.. w~O~~~ 3..J9.-99.
W0
u9L.r
0.u
9 J9J9 JU 99...9W
W-*
U
.5~
9-O
9-
9-I
9-
Wu
W
4W0.
49-
03
0999
IN
03
099
UVIA
9-00
AIU-
9-9
22
WW
04331~04
0.*I
9-U
7~UVU~
*
0442
-0.9
0
6-
0C
4*
4
6-9.9
r. o
Zw.
310.2
0
*499
9--
Nl
..
N
-~r
4
---7
WU.0.0.
=00.
U
0
0
A
.--
5
*
2-4w-99.,-
6 9flO~wO0.
3.000w...9Z.9-
WI
1W9I
U
?27
4
70.
-9
*-WZOOU5.4WOOZJUbWO
w-w9-
i
0W
-J
0
4.9
300w9-9-rc0.:v;
J5.0
wI
4,I
4
99
.
46
X
0
00
WO
3-Z.9
1
.
0.Z
w
W
4
j06-0
3
3
0.9.92w
4
L
25..--3J.
9Zw1q
0
W.)
-
3.
Z.
9-.90003.5.04792:
-000-5.300Z..-...90
DrOoo.9*vu
*M
rWw9J
9-0
U
4
4"WI
7wU
9-9-
Z
4*
2-.
-.
WUO .32:.99
3.u.44*4..j
---
9-99-
0A
9U
w
OW
(LI
9J9-9*
4.0. I
000
9-00-~
-Z
033lur
-D
6-
,.
;I4Z4
w
0.
9-
~~~4
0.U
4
-J
:
909
C) a'.0Li
9-
owr
9.WUV9
M&Mom
y21-
.
W
3yy~LU
-
w~z
4
3.
.9-
9
o
Z
-'r
9-:
c.z
0
w
9.a
'JV
OWL
Z
0I
.
999
0
*4L
2
-Z9.
22
ZO
0
9*0.
-w
0793wU-0.
0
0.40.
*
9
-.-
Z
9
ex
-Z
-C
e
9
0
'L
r4
0..
0
.92
w
03.
0.
.
a 9
33
9%a...9
'A Z
OJQ9..
020.-Z
W-bZI
.4.0.W Z
*.49w~rr-4w.
0.
0.0-9
-2
4
-I"33.09d92:
9..
0--
J W u
.WI4'-Al*9-.99-9
*wwO-49..9
goCU0
00
9-U
U, 9
W.
*.je
UQ000.2:
*
0r
*u*9 0~~ 4....
62:
Z9-e
I.
W
02
0.-
94
.3.
-
U
Z
9-
fx
4
3
7N
4
U U
Z
0I99 . 2 U,
4
00
1N,0U
4
9U
00.00W
0.
0Z
23.239-0
WW)w
0121W.-J0.00
3.
*.
J
C.
6-
0
9-3.
9-
w
0.
ooo
4
0at
X-U,
Oz 0.W9
9..)3.9a.
.
0 '. c
49-
0 ..
J~
4
9
3.
z'uw
.0
4W Z
U
-..
.9
CZL
0.0-U
U. "UJ1-wMuw0?.9-9-b.9-..
U*..30WuLUU
0 .
*9.5
w
w
-3
WZ
00.J4wi
00.
*
U,.
U
-wz
9ruw
04Z
U.
*
00.
-0
uO ~ 9 ~
~
09-
3.
9-~
7
O
:
ZZI
* 0.00W
0.
93
wu
W3..)
.43VI
3
900
03
ac~
U,9 Z
0
=
29-Luc
09-3
02 0
0
0
.
*
5.03
*
4
W
%
z0..
9
3-
* U WZ
r
* wf C)0
*4
wW19.4
*3.
3.0.
ZO
'A.
3
*
0-0
* 9CD
*o..
09
443
a
.
* z-.)3
0.
Z. u
w
*z
OW3.*U.
S9-0
.)W
*O6-4
..
.0
WU
.J
4.
.
0.
-9-
.9900.0..
N
053.
9-.
CI--0W
9
U-
,*40
-.-UJ2:2:23.9 3.*4
9..j
9- ZZ7Z-.
.4000*3.5.
2:0002:5
000649
*9,
0
U.
0
5.IIv
.
9.9
*Y
3.00-
0
0
V*~OUtJY5.
0000.5...
-
P 0L
9-0
9-
9
42
Owr
0
0
0
0.--~~Q4~UUU~~YCI-I~
0
9-
w
3.0
u.
0
0
4Z
9
I
2w
.0
* _
-
*
50 -
so
.
a.
-3.-
rS
*
9'
Z
-.
@0 IL
64
*
*
0
U
W
tD
S
o
)I x
S
a.M-0
S44.
*
*0
*
CL
wILZ GL
O -W
*
*
U
4.
i1
0.&W
4
.Z
W
*Pa
WILO
W1 -
* O
2
*oa
*Z
OXQ
>C
f
a
*3
*
a-
a*
*
0
-_LIC.
3X
*-
--
2
0-
-
C, 3
-
*
O
O
O
W
WZ
4L
0
t
JU
.
-,
_
_
0.tL
.L
3*-o
1
X
0-
ZW
I..
MJ -N
0.).It
L .V
I
2
LI
':a f
Zw.4
S
-
Z
a.
ar
2j.-q
- O
LW
S
01-LW
UN
G
2
WWNL)
2:w
*
1-U
4
OW
S
vx
ZLI-Z
X
>- Z-Z
-4W
4
IL)-
I
U
-
I
O
2
rUN
0
I
u
r
0
j
L
I
*
-
0
Z
-
0
*U
1
UI·U·-
I
UNN
O-
Z
-
--
2ZLO
I
4
S
I
@42
r
a
'I
2'4
-r
N
4
LI
t4
W
@
L
W
.L
9
N
I
ZLU
--
W
ZUN
C)
Loc~~~Y.L~Y
VN-4UNI-
w1
L
.
£
LI
LI
.
VUN.
2
*
. ZN-
Q
49
-
ON
0-N0
0a
i
.4
N_
M"
C*Z
4
.JZ_-4
Z
0-ZO
vr
W40
IL-LI
C
L
-
4
1-
w2
2
2
w
IL
W
IL
wI
NM4I
L
-
W
_
'e
2
O
a
O
3L30
C,
a.
2.
a
r
N
2223IL--LI
4
3343LI34LI30
XZ'-.42Z01
1-
0
-
1-444.
-Y
1-
0
+
NI-
Z
IL.
I Wa
aO
2Z
IL
OU,
Q
IL
-
OL
3
*a:3 _
IL
4
*
I-W
N
W+4-Z
ODL
Z VN
2
Z
ZIEI
0-0.
04-..
ra a·
O*Z2O 4..~ OUNNNLI4N
QJNO
-O
Zaco
*^ 0 ZCL+
_e,
V
S (> ts;
Z_-49W
0
.
-0
.JN.+JO-O0.32
.9
11a.
NZ1-09ILLI934-UJ00.
A04
O
*-4
4*4NLIZI
~~~a
I
U
LI
I*
^. ..
*M+w
1-2:
.W
Y
_
W
2. Z
3
IL
IL
IW
2
.
9
41
L
_
O
- - -
*
-
a.
LU-4G)
+
04
444
.
4
-
-U
"
L
4
U,
I2
*
*
1-44 w-Uz
.'
2W
*W
Q
Z
WLIU
2
LILLIwa.)
0LI
w
4
S S'
I
>
a.W
-1
i-LIN
>
_
ILZ
G
X
1-1-01-4WOLI2IZa.00.U~a
YIIOU~
IWL
Z
WZ
ZW
*
.
0W
O
0O
r2
*
N,
4<
3
04
Z
CLV~)
_
WLI.J W
I-I
W
(3(3~eUZ(3WLZD-4-
4
LWIL
ZZ
W
Z
CL
Qw
LI
,
z
Z
~
04 Z
a.
20
C
W
L
O
*.
LIWU
L
0-LULIA
-O1
0u
.a
.ILaa
Cf.Z
O
O
YL
NW WL
O WO
G
ZILZ
%%.%U'.- WILDO
'444
-40~0 W0
.DX
Z***
Li_
OGt-OO
e
-a.
0 .4LILLI~)WM*
..
J,"
04ILILI4W4QP
W2:0N >
'A
IL
~
I
.0dZ C
*@3L
n-UN
Owa~o
4
I
W
C
ra:
VW
-
zv3IW
WIL O
crOZ w
W
4C
W rs
Sch
W
W
4
IL
U
-
W UL
0aY
a.1--
W
L
)
-
a
UZIo
LL
C
zZC
Z
I
^<2
aX
Z
_Z
(JIL
W .Z
~~~~~~~~~~~~~~
U.
It
Of
N
IL
I0-Ja
~
+
C)r~
1-0
UN
D WW tYN
OZOZ
<
-
W
Z
aW
Wa.
U.
4
4
W
La,
W
-
W
I-
vt
W
.
X
ZZ4.-
1-
IL
1Ja
N
'
0
. V 0.
*
3Z 0
L
*O
G f1
'
a -
4
a.
-UN
4.-L
OOwZLI
0
0 o o W4UJ
'- D -- t
2
,W4
Mla.It
0U0a31-.4-D
N 0
o
N
1Z""63xo
t
ZW
W -OIV
Z
aW aL
0
tL Z
SZI )-41I
W
wIZ
·-IQO
W
a
* * *.IW3
4.
*- 2
21
UN*
N
-o- * *Z
0.*WO
a.
*-*J
C',
Z
Z
Z1
Z
oZW OZS
W
I
I
Z
OW
I
w-W
-X1-IV1
2:
I
4
0.14.*.4
0IXt n
:_ .W Z
*
-
Wt~~~~~~~~~~~~~
Li U,
Z C
1-22
O
XL<
Z
Z
4
*
I,0S
W
1-4t
a.
0-
CL
L
^t
Z
4z
WU
X
ZXLi
O
O
-
O _
Z
J o 2I.
* o --U4Z LL*
WID-I
WOO*
_
-
L
2:
*
Z
MC
I I.0
ZW
2
0
Z
O
Z
*
X
SU
W O
IL
O
-
t<
C
Z
W
M
UN
I
NW
t
4 .o
UN
0
0
U
0
*·
* *O
UNNUN
*)0
Na
zA1I-
x
4<
-4--MLIt
W
-W OO
*a4
a*
30
tl
L
S*
*
L *1
I *U
*
Er)
C> c
a
4
I
* 2 O2
Wa.1*O
V
-2L*+W
LI
_t.I4.LD
_A
_
a..2u.C*,Q~
L\L
aCa.ELa.4.- 9
a.4-I4
WILL)
-a.--a
a.
tC3~lQ4-O
a.IU.
a M3-0
ILIJ.
W-I
1WWUl
N-3
a
*
W
· t
o 0
K
3 't
O
~d4
0O
*O
*
*u.
*2 1
U
U,
at
a
' *, *
y
0J Z_
W-UV)
-3
*
a2:2
LC
304z
W024 3
0.
WW
2
-'0
*
t>O
-wU
a*v
)
a*
X
*,
WW
.
0.1 O
-·
^
CC)
*
s 0Z W
_.
*
>
t,24
S
* _a
Z
It
IZ
M. O-*Z
L.
.Z. C
a*
*'a*L
0
W cc...
3W
a
0.Z.
O g
tIW
*
.-
Wu,
a
a
L
*
a
a
LID
O
Z Z
LI
W W
-51PHASE II
The programs of Phase II are very similar to those of Phase I.
The structure of TA
and SI are nearly identical to the same ar-
The dictionary used as a reference for Phase II
rays of Phase I.
is just that generated in Phase I.
The major difference in the pro-
gramming for Phase II is first, that rather than place data into the
dictionary,
Phase II extracts data from the dictionary.
Second,
Phase II must operate on the extracted data to form a pre-index.
Thus, changes have been made to the
arrays,
P
indexes.
and
D array; also, two extra
PI, have been provided to help in generating pre-
Thus the directory array, PTRS, for Phase II contains
two additional entries.
PTRS(14)
contains a pointer to P; PTRS(15)
a pointer to PI.
The
D
Array
The
TA.
D
array is still used to store the usage data for all words in'
The usage data in
D can than serve as an immediate reference
for evaluation of pre-indexes.
word for
D
However, in Phase II only one computer
is used to store usage information, rather than the three
computer words used in Phase I.
The
P
Array
The data extracted from the dictionary for all words of are placed
in array
PO
One computer word in
The data placed in
P
P
is used for each word of TA.
are the usage rate for the word and the frequency
of appearance of the word.
The
PI Array
The
PI
array is used for flags which mark those words se-
lected for a pre-indexo
The rightmost nine bits of a word in
PI
-52are used to indicate selected words.
Three flags are reserved for
each method in order to handle three parameter sets simultaneously.
The
PARA Array
One further set of data is necessary to operate Phase II.
Three
sets of parameters must be specified for each pre-indexing method.
PARA holds 18 parameters in total.
Method I requires three param-
eters for three trials; method II, six; and method III, nine.
Operation of Phase II
To operate Phase II, all Phase II programs must be loaded and
started from a CTSS console.
FILE must be available.
The files CATDIR FILE and CATREC
The program will request the user to input
the parameter sets for each method.
The
parameters
specified as two-digit integers on a scale from
1 to 31.
must be
The pro-
gram converts the parameters to appropriate usage rate for frequency
thresholds.
Once the parameters have been typed in,
quest a document number.
the program will re-
The user may type a four-digit number.
When the pre-indexes have been run, the program will printout a
summary evaluation and wait for the next document number.
To obtain a full listing of the pre-indexes for a document, the
user may type a
1 after the document number.
The pre-indexing
system will print a table showing each word of the title and abstract.
and show whether the word has been included in a pre-index for each
method and each parameter set.
-53 -
.
0o
W-4 U,
U U.
cc
.
x
.44
.9
Z
a.
-04
0
U
V-
0
.
a.
W
0
W.O
UJ
C
I
C -ZW
-?
_
- o
4W
.
-J
~~~~0 ~
4
4.0_o
-'0
9-
_
*
uiZ·
U.
~
I
~
~
Z
Z
_Z
Z.
-f
.
.4
f
>zz
O
Oa
u
2
*
OU*
-
*
.L
*
0W.
*z
4'^
*
O < Z *.4 0
.
4U.
s
CsQZ
0. L_
0
W
viW
0
I
U
a.
0
CL I
.
.
_
t<YYWU<YXWO<YXWQ<Xff
0 La0.
0
F
2t
*
4O.- P-N
mU
'
0o
4-
4
4 . W
00
>3s
2
20
0o
4-42 U.4.-2U.
.
_
*
.-
%O.
:
o.
"
._
-4 .
_
2 mo.mU
X.
O
O.A
I
-
W
UU
_4
-LU
O
.
*N
X
'
0o
LzW-
V.443
N.Qw
4mmm
*'0
Z
.m
m
ZW
o
4
-
W.
-<4--
a,
U W :- n
_<.,_,
1boC*XN_
h0
§c
1
aX
OX4
MLJQ_--We
0
*
0)0.
W
W
~
-
U
0
92
2
W
~£ ~~..4
-
C3
U,
Z C;)
0
* g
*N
*W
~
V
Tc.
444
U
~
.
-9
*
0
Su
~
-
C
*2
~
~
<r z~~~4,4
* S
*2.
*.
at
.
. _ W
0
-
''0
(
W
_
_
-
·
,
.4
.4
F.CC.)
_
*
y* r*
*
.
.W
-f .
_
..
44-
_
Z.
mQ:
0.*L
O-0.
4 -W
0
4,
.
Q~G4-
-
-
X O
2
a
U.Z
v
f
L)
-54-
%.
4
UI--
~ ~~ ~ ~
CD
U,
W""'
X
Z
Z
*00M
1-40
0.0
WU,
oC)
C)
0.
~ ~ ~
2
Z Z22
ILW
M.,ZZ
I( X
-
-
.-J
.4
WO
INI-0
:-
t
c
~~Z
L)
U.0...1N0(
0I.,
3323U..
0
00*JI1 00 .3
I*
j C(I
- 01-0.
21JU.U.
ZLI
c~-
n4
0
2-X
£0.
..
LI
-
0L
2
*1*Wt
*
W.>O
-.
0.
.
1-
.
W
0.2
0
-00~0.
X. X
U
>
U
ZW
2X-I
U ;
C) U
0.
ILI u
CO 'aC
.111-C)JaZI.ah-Ua
ZWw
)O
~2
1Z
*.Z4
Z .
WW
C- 00
3- ~'xw
U.~O,.-
wO
>
C*23>
aZZC
C
Z
f2IZW*U
*u2
h-i
Z4
.0000-U.
L.
·
C
Z
;IU, a-U
u-'
U,
0.,0
Z0U
I
-Ow
Z- U...t)
0,
I
U)~..OU)I0. 2-0
U,
'%OU,0
0
4I
0.
&Z -00b
1.
'
2
0
U.
0..
'LL
U.
O -.O- 2
LI
0.
j
.
U)
- )J)I ...
>
0-.l
L -0 ZI-Z
2. - U 1,
U.
2w2I.-U)
W
;.
1-J
QIC
C
W
.>..2
C.*ZwU
W
U. *U..
UOC.
M.1
U. r-I0
c X
0...ZZ
00.
0
ruO,
U.
P
1U. *XI.L
C..JCJ.3
WOO
or<
0l...
Z
WO-
LIZ
0
CcJ
.
1-.
22
-
W
.
1
: 0
-Z
U.
V
UI
LI
%.
?
0.
U.
W
0
4
2
>
U
LI
U.
a.
I-.
*
0*
1-1-10.
W
.00
N2U. 0
-Iai-
-WO
WC
01-Z
1- U0-0.
U)
-0
20.
204Z. Z -
O. U,
L,,CC
CC
U.
$
u
0U.
1-IU
.
U
W2£ U
C)
0. 1Z
w
U
-4
a.
U -J
*
U.;
Z
0.
U,)
II-. Ww
:, 2
3.
I~~~~~~~~~~~~nI
1-2
.I
L·
1-I-
0
,,,,
- r-9
U.C10
wC
~
Z
U
1-
Li
·
.
Z
i=, i
U
- *2 Z
(~~~~U)0..
.
1-0W1In J
.U.-.
ci
"W0
w
LIM
-..
jwwO
22U.
1--
*.Jw
L, -,
0.4
~ 00~4
C
CCC0
-*4-JO* W
-ruwUcunfO~(AI.c~r~-~-I...3
O--(3-0.Z0.0
- Il a
-WU-h-0
-W0.1-1-1-1-(0
oQ..-0
0.0.
a
0.
0.04L4
.0
3CO9h0.
3CO
h-J
2£
£
LZ £
UWL--U.UU
£U-l&-
0.0U
WY~
0n
U.
LIOI) 00.
LOLIW
0
02
Z
* -
0
cr~~~~~~r
U .L0(3.0
I4~c£U~~~~~~~~~~~~~~~
LII.J U.WUW
(0
0.IU
a,
-J
-~r
C- -2
-
U.)
*aL~ur~~r
-O~arO~~(LO~~
000.U)+*~~r*LIc(U
.
0.0
U.O
-
a
U.0.
I.a..*-I0
-ON
*...W.1-00
000.C
3C00..)
U.
I(AU £
2
U.I
0
I
a.U,
1
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~m
CD~
Ca.
$*
90 Z a
* U.
a~~)
In
WLU)I.I
2*' u,.I
40.0.OZO0.0Z
0.4
1I
UI.
U.
£U.0.0.~
1-
-
aL
@0
.
.
-
C0
~~~~
~o~~~~
z
*
U,
0..)
LI
I.%J
0-*
U.
U.
cc
I.)
d
C)(0(0
2 p.
U.
(0
.
.3
0
I
*-
(UIU
CZW
Z
U.
M, U.
.0(0-0.1
WX
-.
Z'
..
I
0.-LZ
::C)
'
2
.....
uw
0..
(01-0.0
0--
..-
Z
CZIU.41
Z
ui
>.
:1 X1-2
-0.W
0 90
0 -0* cLI
.I(
U
0
0.
0
U 4 LiOr.
U... J~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~.
3
W. ( 10Z
.
- U,,.DuJ
a0
)0LW
(0e~4
0
W
u~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~u
*0
W
---L
U0
Z
4
00
2 Z IO 40 -0 0
4
0
-W WZ 1
*
-U
1
-C.
00
0.)N u 0
4
2
001 Z
0(U
1Of 00
0
O W
.. 0
@0.
. 400
-.*
I0
Z;3 -0
0.
*U.
0-C
W 00.0 *0
1..(. 1C,
*0o W
.03
0.
Z
1- Z -0-0
LI
@0 02
-U"0.
b2-U01
L
2
0.03. UJ
L(U.L
.)
-+.0,Z> 1Z
04(I3
C
~
St.
LI
W-LU
.
LI
LIL00.J.I
0II 01 A LW
*U.U.)-~~~~~ 0
a
'LUI'. I.C
aC
0U2
w0.0
00
00
U.
* *
0
00C
@4
0
02
atV
*0.
.2.
3
40 U.
a.d
0
0
I
I'
..
J
IOU.
I
10. (I-C
20
0301 U.(~~
1ZW00.
I-I
?3
0'
t
LI
(.1
N~~~
a
LI4
. '
0.0.U.I'-1-NWN1-
0U.L.-.I
0
0 * WU.LO.40
0O'OU*00UUJL 0
~
.iuII
3-I
.:1
k
0
0.
.1
U.
Z
CL
I
4
30
(
W
.
0~~~~
L
Z*
4
0
%
.
4
0
W
.
.L
0.-C)
000U,
0
%~~01:U,
.4LI
0.
C
02
I.2
-.
-0IL
0-I
LWI
(0
-
(0u
IC
-I
U.
2I
0. VPI)1-
Ul
U
.J
4
1U X-0..0
0~a
4
X
10A
U)
o2-
I.
.
)
-
2
ZMo
0(20
1
3.
U).I
l
I
.
*C)0.
.1
1
(
.
4J
--
100.
1-w3
Z0
-
(0
5A.I-.-IZZ
-..
0
v
Z
(
.44U U
I
N
..
U.K
-
X aC.
. 2 U.
-- 2
03T
0L1
0 T.U.
I
U
2
1
JIZ
Z
0
-
,
-XI
.~~~~
0
*W1-
.
(LW
0. C)2
O--3
C
Ul
w
U
I
(0
32
-2
.
_
-
0.U.
C.-0..L U. X2.I
!1
02
wj0
-
..
U,
.
U
x
(0 30
1U
.2 UZr £J7
U
00.0.22 -w
LI
U.'
£
.Z
-.
0.0i
4U.
Z
U).
102
.J
~~0wW~~a.~
~
..
.
U.
0
2
U0U. L,-.0
U.
WO
124
0I
1'I
0
~~0U
22-L-1
"IL
. X
U U -J
-
~
Z
IC110
-
24
2.0-L
0
00
-
IU....I
U
'0
lU.
.03ZJ
0-
0
U
3
(
-
-55-
cr it
0~~~~~~~~~~~~~~~~~~~~~~~~0
U,~~~~~~~~~~~~~~~~~~~~~N
J4 NJ
Z
LD
U
.~~~~~~~~~~~~~~~~~~~~~~C)
0J
r
Q
Z
0
I
wuC.
--
~ ~~ .
ZIa:
t -uu W4
o
X
2
0.
a
-z'
0.
0
4C
4
~ ~~~~~~
~~
U,~
-
o
-
0-.
U
W
a W
240D~
0
Wu
2
4-
4
Z
0.4.4
4L44-iA
· ZL
.40.4-4
Z
43 4rU3
OLU.~0
NJ.
y Il r
W4r-*20LILI
IL
2
NJ*41.
X
WZ
*2WO
~
~~
4
4-4
z4
y0.
4-4
0.
IA
22
..
d4IA
4Wu
4-N443
*
4N4
-s
0.
IA
U
DL
N44.4.INJL~P -I
4rrV 4
4.JN
I
*0.NJIA&
4a
4
4-
~
~ ~ ~ ~
o
.42
LI
0.
4~O~
.43044
L
-
~
.~
W·
C.)
O4
U,
NI
&
U
CC)2
2
-
W
O4
W
*t
*
·9~~~~~~~~~~~~~~~
~~~~~~~~
*~~~
~~~~~~~
a
N
N
*
4do£
A
.
0.
*0
*
0U
*N
*2D
N
N
·
K*
*
*J*
*
*2
*
4..
24-0
0.~
'3
I
N
0
4
ON
0.
4~
0
U
N
0
2
.
4.3
4-U.,
w
2
4-
A
.U
U,
4-
WJ
NI
N 3034
*WWO
*
4
LI
~ ~~~
0
2.
3*4
~
-ILL2
0
W
.-0
N
'1
0.
4
3
0N-.0
I
0004.O
Ij44
I.S....NN
W4JJ0JL
LU
0.A444433402
'4
0
0
4
4JN
.4
40.
c
0.2.4
U.4W~~~0
X 2
4ZN0IX
J. LI LI
U
0.
.4
3
C) 1.4
4-L
(J4~~~~~~~~~~~~~~~~~~~~~~~~~~N
00.
U
0
4 0
N
LI
0
LI
U
NJ·
NJ
0W
I'd
Z
~
4~
W4-
W00
.
3
N
0
.00
-42
I.3
0..4.
ILJ,
NI-ID
20
044,
2
44
4-)
3U
IA
IA
IAIA
A
14--
4
J
3
4.
I'N
OA
.J.
NN
.
0
.01
..
000).
44
4
c
444
22W2'
4N
L.
0
U.UQQ
Z
4
O
0
-
4-
ON
..
.J334
U
40
V 4~
~
4IP
00
..
J-4w4
.0a-.4J.
4NN.0-I
4-4.22c422·4'2
*
U4c
0.004~~~~~~~~~~~~0
W
W
N
.44N
N
W.I
4NNu
N IL 0.
444
O0
W-2I
?Ir4
2.
0.
00
(3042
I
2041,041.
N
0J
00
a
44
--0IA
U00t
4
102
*
X2
2.1
UX0.U
t(-00Z 4
0ONL
WN
0424
4-
*
N-A
4.
44N
-N
0U0.
..-£3.
0
43 1
4
W0.0.IA0.2~~ZI
Y
4
4
0
*
VI
-
04
N4
0
O
I
N
4LZ
2-.-3
.4IA
.
..
JN
2TU
Q
L
4-
0
0
000
4-
0
0
Z22
*0.
42
.W~2N
*-4
.N
.
0
0.0-2~WU 1W4t4-
. . .. U.
0.
2
2
4
3
C N
4
4
0
02
4
N
J.
- LI N
NO~
41
4
0..0
0.
NJ
0.J
0
.0.
.
4.4
a
V443)
44
3I
0.0
UC
.J3N
4
0
.W
rr-c
l
Qx
~
-·2
c
(
~
·.
IA
-~
-0.2.0.41
44J
4-1c
0440.2c-'44c-4--4.441.4u~-NO
-33224---
£24U.0.JNJ..J0.3NOI
41
20
UZWUw.-
~
.40.
-23X
WLL2
0
4-
~
444-44A
-
.4
.4
4
4
WWWOW~
~
.4
I
~
~N
0.0.0..JE.~ 04-N4-40
WW414IW
0.0
42
N
CL -4
4LI
00.~~~~~~0
C
~~~~
P.U.
2
0.
0
*
W~.
-3
1.
0
0
*1.4
*43
W
LI
*
N
*A
N0
~
z
4-
0.44434W
0
2
4
y
2N0
0-wOO
U
r0ZO
Nw
40.IAO'0-4-40
0.20w4J444-4 WN
0..
4-
IA
WU
WO0r
.
W
00.
WU>
-J
00
C.-4
4.JNc
0.LI
0.-LI
rr
0
4
a.
·o
*
~
Z
me ocrY
-4
U,
U
~L~
wY'~
430
I4
OY00.-40
U.UU U.n
-
~~
0...J
r1J
N~YIA.
4-04
4440
nJ4(0.ONJ
e
U,
10
r
44
2
-04-N4222Y0
LU
N ..
JW~ N .4UUJLIJNJ-24
LI41.U.
4
U
N24.uw
N..UsCU·
z ar
Q~
NN
a
4
-~
-0.
*
04L
WY
00.4
Z
z 2t
4
c 0 2
r-·
*c
04w
-.
32
2
4
4
444 .4.
4-4-O
4...
N
N-I
2
..j
2
NJ4
0
c
-0U
24-U.-'04
LI4 41
ON c1r0
430
2
&
4-4..
N44
ir
NJW
.JNJI4- c
Z&ILU-I
.Z
20. *UJ1C
o
..;.
Z) W
W
2-
3-
cu 04-20.*O
~WNJ
44.
XX
'O
do-go
44
2.0NJ
0
0*r
2C
*
N-arm~ IL3Z
00N
W
CD
n.4co
.4
4- 02
CL
2-324rro2
2
U,
NJ
4W~~~ A
*
0
0
333
3
WW
W
44
-56-
Li
-L
(66
-L0
6/,
4i
IW
.6W
0
~~466L
C
0
W
6Z
CLK
8
I
366VI
VI
VI
1 I
0
Li30
666-NZN-
3
-1
*
-
6fl
3
0.
0
~ ~ ~ ~
0
uo
(6
6/14-
VIlNUO6O
*..J3
.JZ
*WU
~~t
360.610
£3
6
*·
4
3
6£
Li-L
Li
6-
I
01/14
6-
66
0
66V
0.6
0
II
Li
r1
1A~
a
*1..
IU.IV
TO
6
)
66L66.1.
0..U.....J66A00...
OO
ZI).1(~r~2
3..i6
w
·
a~LYvf
~~~~~~~~~~
~~(Z~~rLU~~~X
~
*~n~N.*3~~~1-CC*~
ru~~~~~~~~~~~~0d
6
P-
46-3
LiiI
-'0 u 0
01 -4iZ00)
0.10.061
0U00
er
Cn~Ii.
660000
(3-666666
66-0iLii3
ILC
L
0
33300
LiI
4
1
I3
Z
ZI
6U
461ii~~~
01
6-01.
66
6/6
0.46C
UV
U.
'0
KU.
0~
ZW
6
U.
0
661
666-UU 66
660.
0.ZO.VIVI.I
63(3…V
LIU~cU
U.3
301010~~C;Y)'1VQr-C
WO~
O
10
61-
116116
U.I
L
~
2I
0
VI
Z.01-3(
6U
.. 0..I..0Z
dZ~~~~~~~~~~~l~~V
Z
Z 01
0101
UO~~~~
6V
UILiOVI
.
6
W6--6-0.+c
0.6-66-6h-U
ZWXZZ0
0
1(I
z
66
I26
VI
6
666
Z.
-~Z
0
ua
-
22
6(
66
.J6J
~
'
4
"
uZ
01
0OZ(I0W.VI
LiU U
C1~6
.6(
~
(3
n~
66b 3
66
W
1
'
I
6
.01
Z
3U.U.6-U.U.16
x~~~~~
6-6J6
~
~
uu
40.ulru
:
0
666
V
-'
0.
0..
W
V
3
3
.o
0.
U.
X
.
-
xur
~
C)
J. J
44444-
iiD~
Z~~Z
Z
NNN411O
w
~I2:
IZ
,D(D.4I
0
021111
W.J
I
6-6---hj 66VI
66
6a
Z
.
6/6661
0
ocUUUd
IIZd
0CLLC
WI
0C'0.O9OOV
6641
0
*U64n
)( -.
66V
00000
VI.V16U-6
6KOVIV
2.
66~Y'-0.J
0.
10
6VIVIO.
W
Q.J.J~J.I
6 L
0W~I-66.
ZC
-rYIUYVALI
A6/ 1UUI
U
66yLi
66-
ry
0
11666666
O~~
ZC)
6:ZZZZ
IXC)
r
+cI
6
6/6PL
*?I
0
Z~~~~~~PZ~~~~4a
W061Li6-VI0.3
V
4 (
-
~
Li-0
Elii
0. p.666-6/1
0Z
_ 0
j 4:
-- 2X-.U04)
I.-W
0066066
6.04
4Li0
*4
Li
61
*K
10.01-0.**
JO..0.0.OUJ.300.
6 ...
34
3
13
.133331
B66U.1.16660.66.366I-U.6~03333
1 66
ZZi'
.CU.6-01W0166Zrl9
3
6 6
I.00
1(14.
KOU.6WO~
CtU.3LU.U66UVI1O4W0 66q~~
*4*4*
61
3
.?
0
*VI
0
666
66.
VI
Q404C
V
00&66..
6631/
Li.J
U. 06 6 66VI*U
0.
*?-0*N
1-
~U
I
6-
VI
Li
L
3
-
K3
0.6-L
fm
K
.6-I
363
00000
Z,
S d.
Li*
C
0
OC00
a01
.JU
~
66
66C
VII
£
w66!
4
6.4
J
Z
w
~
N
4JLIzi
I
GOQOVI
00
4
3
'~
-0
-o
.Wo u
.aZ
(0.40
66-Y
4nZE4w
66
6LI
LiU..)
y-44Z
U.2.W6IO
W
PPIC)L
0
, a '.·Lu?
*74-U.Li0
6-6/
NV
6.40K
X Id .
.J0W6-6--)CO
2
Li
0W4P/1Z0.30.LiVIU.?I.166-lJ
A
W
0
*
~
VI00
4
4
a
C
.
*u
66-f
Z40.4WXr0
wI
066'
~
~
~
W
W
-
Z C) (
I iL
C
Z-L
M- "L l
.
4WIL
-
4-a
X
c
u
Li
(3
.
00.0~~~~~.0I
Z
(3Li
46/U6W
J
-
oo
660
U~ )
66
6.
Li
Z~~~~~~~
.:
U.(f0
ZLUPIOWZ
w w
UJ-i0-VI
JI"
A
I ZI
O4UOW
'UZaW0.
3400o:,~Z
W
I
Lia'fl
0
(3
wU
lW
No . 0
Li~~~-i~
66
6IQC
Z
W
-rI
34
&
LiZ
2:2:(00
I~
IV
0~
VI VIQW~~^
VI~Cb
VI~IJ~~ur
re6.-U avxi
6/66/1~4Q4
~~*1U
LI(ls
Licr~
33
-57-
0
W
W
O~~~~ i.U
u~~~W
c
0
'
. Z
l
.
C
Z
i-)4, 3. 0
-YIL LY :C 'a
ZIL .
''
I ul .)-NL
,I 0.
LU·
W0
0UJ
ZI,-]L
I '-L
'"'1
I
OU
.0*
II,..
4*o
o o
*Ll
Oc
*ZI43-
L
WIl
U
C rc~~~~~~~~~~~~~~~~~~~~~~a
O W
~~~~ ~ ~ ~ ~ ~ ~ '*~ ~ z,,o~~~~~~~~.
.4.~~~~~
~ OOILZ
..ZIu
-
.1
4a
a
U
.)NC
I
M..
rr
V
- .b
Z'L
WILL4
0
I~~~~~~~~~~~~~~V
aal
a,,
*
·
aaL
in~~~~~~~~~~i*
I0..Z
.
Z
3
W2
3
U
UjI
0~.
a~'-'":!
·
Q~~~~-0
6.
77Y
aL
ii
u
*
*ILI
*
.
'0
I
4
U
fl
~
0
-
44
7
I
n
-
70
3Z
N
0
-1
(ZU
.0L0
'
.J
-
IL
Ia
0
L
3.J
0
0
0
*0I
.0
I
I
nI
0
.
~ ~~~L
j
0
1"'~
7
7
I
0
.4
5
0
4
5-
*
..
03
.5
j
n.54
.
a.
4
L
0(
5-
0C
0~U
Inr
Z 5-v
IL
0.
r0..
t
c
5-
u. (I…
…
.3
-
0
1IL
I
IL.'4
..
…
…
3
ILO
0
p
1-4
-49
0
449I4
-
0000.10
I0LIILI5-n ~
1J,
k) 01UIWr4A..22
In)
ZO
*-L0
IL.
0.)
4
00I0.7
..
L.0
?
0
W
0....." '
-
.a*'
0SLl0I
.0.Z..
3
I
L
0
0
41W-
-L
IL-
0
£4£05I5
*t
7
I
.
ZI
9
0
I
J
~I
~~~~~~~~~~
0
~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~M
~~~~~~~~~
. . .00
~
~
'.
~-r
OA
~-IuI0
320').LIZ.010
Z2rr
I00
0-
7
4
94
.
04
i
1-11-0
I00.i0
.W
02.
£i2-5
L
IL
OZ
7
4
.
1-.
1-O
5-5
I
In
U4
Ida.
U
ILa
-
0
0
L
..
.
2
7
-ZS.)
.1s
IL
.e.J
5U N
-~~~~~~~~~~~~~~~~~~~V
4
-
0
- 5I
4
).5-IL
0 I
I
41 IL
ur~
ILJ.*-Q7L
IL
0C
*.4..
kL
N 5-3 =)~~~~~
0
Z 0..
-..ILOIL
2.
3t
441-.4
WO
0-OI
4.a.(,)
0
lr
54
OI6.1~~~~~~~~~
0.
0.
a4
acn
IL *-
94
*0.
44145- 1- 0.
II0.
-
u
IL
L
5
*
' .
SO"'
IL3
*
*
a
0
7.
£45
5-DI
)..
°°
O
0
2
'5
n
4
W
5
L
ua
0
0
0.0W~
1IL W3
.
S
)( In
co
IL.
13. I,-
*4.
-J5-
~~~~~~ 0~
~
.j.N um I
-.
*l
aN ILC
a1-
Ia
-
Il
Z
LI
X0.w
.1
0
I04
LI
I 0
0ZC~~~~~~~~~~~~~~~~~~~~~~~~~
0
r5-5-
IL
* *
-
ILZ£J.-.
-
0.
0n
17I
47L2.
Ll
IL
7~L
ILJ
ou.0
a.
a
00
03
10
c
ZII0
.10
0
00
I
1.9Wuh
12
-
OS.404
0
400
'·
rc
0
L
0
-58-
=o
0
LU U
v,I~~~~~~~~~~~~~~~~~~~~~~~~~~~U
ru u~~~~~~~~~~~~~~
i l
W
4
I
U.
a,
I
.
3Z
<
0o Nt'9~
zU,
u
W
0.a~
.4
D Z LUU0,
I. 4
W
WJ
.-Z_
.
Wc
-.
c
U,
LU
U
H
- ! 0
0.c
SO
U.
2.
0.
.NU.
L
VU
3 MN)*
w
w0. WO
0.0.0
.
U0-X
*
H-30.wUZ4
-.
044
0.1
·
*4
W W 7- 3
W
0
0O
W
=w
U
*r
40-0.4.
U
0.
0.644
3
Z
.
Z
I.
~
"Cc
ofI.M0.Hcc
4 '
__3
LUZO0
*** 3.4N
L4
4.U.4U.S-U
0.0
.4.
-
*
-04.W40.LILL..-W
-C0L.H-.40uZ
*
30-wOO
I~---?-N
*
0.H-J
L~~
-
: U,~
.1
-
L C.,
L(
o
'4'
0. Z
0.
H
U'
*HH
4'N4L
-z
N'J0.3
HNN
4H
4'
*
00.110.-I
-0.0U4
-.
C43
* 0
HU
4-0-0(
.0.>w0
-NZO
*
uNLU>U
*-N
4QU>
*U0.
N9
U-4d0U~O0.UN-4
2
SO
-l.4
30CL.0-4-0.0
).>U
*0
0.4-L
0.
-0.0
.40.Z44.U40.U
041 0.
LUUJ4WO
0.0-0..
~
C IL W
*
WX
LULUW
U,
Z
WM
0.ULUuZ
C
>
.
W
-.
JI
"I
0-
IW
"I
0
4LU
Z UU
1VI
W
0.
*
0
U
-.
*
H-
av
UI 3
Z6
CI64UOZ
-1
4.4
~ ..i
3
0
-
Sr
-W0
H-U£
4
-4U
L
z
CJ.)if
ILZO4'
Z...U
..
.
LUU0.
*HHUNUN
Z
0-3
U
H-HI
.
*6N0
2
.>. UN.
*
Ua4
4
ZW 00 WU-WM 0W-6WM0
IMMM4
0.
LJ
*
O'4.0
LU
17J.
64
030.)..
4
.4
H-.
3
0.646-Z
.4
cwZt4
30Z0.N.
HI
WU.'4
0*
0.-40
WiA
-.
I. ..H0-LU04
4
LU
UUO40.
W0.
He
Z
U0
NuN
4.4/4
HN(Z
02
5lr-40
4
H0.U.IJI4
ULL)
IN&.L
IL~LU440
UZ
*UH-20. NZ
0 4-0
U
0
LU UN AZ
. O4-m S 4W~3~)
X
0*Z.uU.44
0.0U
OUU
HW-4110(~r
~- ~
.
-
~~~~~~~~~~~~~~~~~~~~~~~~~~~L0.
03W*U
W..O
3 ZiO
0
I-N~I0-Z--H-U4..LUU4N.C-4--.-.1
0. 5.
U.4ULNH-U
U,
OI.~
L.C1 u..--0'-l
II~
W-40.-Z--ZJ0.
Z
-0.30.0
Luf)
0.
.4 0H-HH-0.
U,4
*4UU4U
H-HZZ4N
.
4
34/,H-44UH-HU
Z4-4-Z"I
N
N-k
I
U 0
4/X
.4:
04<0ZC H- II
-0
W
LU
0W0.
0H
0.
UNr
0Q
4444444
Ua HHLU
LU
NC
0at
'N.
LU
0.
'N
NLU
- 30
0H-LU0.
U,40(0 U,40
U,
2
0 0.
00N
N
40S
U.
W
0.
U
LU
LUI
U
UN
4
44.000
LI1
2.
4
QD XZU
Z
ZLU
IL
ZU
N.U2.UNv
N
H-H-U
.4
Nr
3
CI N
0u
000 U
U
0.
0.
LU
---N
N
e
~
rr-
44z
2.
0.
-J
LU
H--4
4
·0.
-3
H0
U
M
N
·
N
--
vI
-
4.
4
* r
rc
IU,44W0
0
H'000
0
.
N N N~
N
000
0.
H
-WYr
4
0Nf~t~JIT~t~t~t_"
*
00
0.teOa-OU.
3OO0
-t4tJt
-UU~n~0O0OQ00
O HOI*
NNN
I
*
.IOW-Q.
44X
0 .0 .'Z
L*
0-4Hvu,4o
NtU
0 (
N N L N
I L .43N-444.ZZ
0(0(0( Z
Zk0(4..J44Z
NNNU
Z
U
.4
0U,...JJ..J .J.44Z444WN.W
r~rr
4-H4-NH-
Z
4
o7
W
LU0
LU0-0.
0
3
-
00.H-u..ZLU0.
2
-)
I0C
3U
NN4
H-
U
H, LU
N
4. I00
.4
UZ
40
..
J
0
4.
O
N
.
0.
N
"Ir
HHUH-4
UUN0.2 L
LULUWI-WOD
NHI
UN3
ox
XM"Z0
4
30.4
0.4
cnL
U,.-U,
, NH -I
-·
UJ.4IO.4-IOU
H-4.L
OHH-Z
0
10g-14
N Z-M7UN
44,
N4
4
4
Z 40Nu.
00-0.
0.·-·Z0.IU
H-34..NN-H0
H
NC
LU
0
LU
LUH-0
44--L
34
4Z
JI
H--..LU.
4.0.
0.
0
I
.
0
4
40~~~~~~~~~~~~~
-Z
WZVI'
44
- ,
OU.40 U,
0-U,
.)
I
HLU
4.
---
HZ
0
0
M I~~r~ZZLZZ~~Q~
N
0~~~~~~~~~~~~~~~~~'.I.
0 !
U
-J
0.I
40.
0.C
H
'N
4L
UW
--
I
I.
0.
UN+
U
IL
2~~~~~~~~~1
0,0.
U
LU
0r
N
IL
I4
U,
U
-
-
1C.0H
34.
3-4.Q
U3U
H-IUI
L
0-0.0 -U~
W
Ca4
..-.
44
4
LU
.4
H-a
H
I0
H
H0.
IL
U)
LU
U
0.
U
0.
C
3
4
3"
NLUI
N~4
NH-HU,
0
44Z
N
-·UN.
- N N·
0.00
0
NC
LU
0.
HLU
0.
H0N
0.
H-
.C
LU
44-H-·-4
LNO
U
Iu~
U
0
0U
U
Z
LUJL
0
0.LU
Z
40.
I
O Z
W
4 H-LU'NA
HUNNC1'N.
OU .W4.
034.0.
4
0'
N·..
ON(NH
CUU
H-U
0-0
Z
LU
0
-4
N4 UU
Nr N
UOO
'00
U--'- I
Nm
*UU
4N
NC(
xH-
-X
U-
'
~
0NH-0.0.0.~(
L
H H4
U
-
3
H-H-.4I0.
.N
· N
·
-~ N
0.
010
N0.
IL0.
0-4H-LUC0
0000NNNI
NH-Z
N
4-NH111*
LUN
NH N N
4
I
WL-·-ruSN · AL
U,
LUL
N
0.3
…
…
H-L~~(NUYr
ON
*
42.4
4
V*)UU.
'UOa
ZNNNNNNN0ZZHZL U2.-H-LU
..
40.LC-Z
…
0.0.---ar~
3-'
H -.
40.·- 440
0.0
4~w-LU.-IH-.-4N444
-V
U
.4.H
0.044H-Q
N.NNCW
NUUU0.0.0.--U
H-I -4.
H-OH~r-0.~
W4/N r-NH,-LU0.4UUU0.X0
0r
K.
-
U UUUUUUU
-00000000
4N-ttt
-6
00
000000
VV
U
UJt~t-t
.)-r·t
Nt
OO
O
OQ
OOUOOUU
O
a
LU
HL
O
-
24
Z
ONH-4.
IUNWLU0.--r
02.
04cu
0-0
33
34W
CJUN
U,Z
WU,
W
N4
.
0.I
W
-.
W
0.
4.LU2
4. 0-U
0N
0-.
CN
U2.
U
M
XUN
0.
0.
H-.4.
A
UNLU0.40
M.
0.
0)-o.'
LU
0
0tN.V
-Z
4
U"~.~NZz
~~..
0
0r 0 C
L
M
U
W
W
ILO0Z-~.u-
0
Lu
4
..
~~~~~
U 3; MLU.
-2
/
0. Z
-u
0.2M.W.
4.04.
0
-J
U
I
W0.0
U
I0
w0.
U C,
UUU
..
Ny
U
LUW3
II
.4.
.4
30.
4WI
M- -J44U
M2M.
U-
n WLWUU...4U-.u.0.--z..
:WU U
LU
* .0.
WW
IZZ ZZZU.
U
W
U0.U,
"J
W
IC"I
c, I, .4I
M
Z
2
LU
W
U.4
OW )NM 0.
00.U.
..
>U,
,
00.0c
WZLUO
Q
14.H
H-2.U
0 (
- -U,
L N-2
IA0.
UCL UW
39
CZ 0
UU
HX 'a' I4
LU
Os
34
*m
N-)U
..
NU
Llu.IJJ
~U, . 0 ~rL U. ....
~ ~ ~ ~
U
LU
1
U
3-
UI .
Z
H-N'
.
H-4
W0WU
.4
0.
N-0.U
I w
*-N
6-4
H-..4--U
.40.0
I
LU -0(0.0.0
4W
U'J-..I---..4>0
0-)-0. -000(30.
N0.4 .0.
2.~N
c14.U
U
A
0W4U,
WJL
4a-J a. 0.0.0.-4
0Ul
a.
..
*.4.U
* LUIS
00.0
-OZ
.0(00/
rLU
or
*
iN
UO
0.4
Wu.
*2.
0
63
Z.
WO
I,
4-0.
U
w
03
WI
WLUN
~~~~~
~~~~~~~~
0
W
- '
144
J.H-NUZ
S
L44104.
40.1 3
0
WUwZ
1I L" U,
ZULUM4'
O.
W
Cc
rq~~q
*
WH-44N2.44
v a
0
0.LcUH-O-4
1^
S
~
.
U,"IN
W.IL.0L
LUCZ W
UN
3~
C)
U
H-4
UW
40
LU
I
LU0.
*
4
I4 IU".,
It,
MW-ZO
a
S
U
4:
CZWLUN(H-
0.
o
U.
U
I
UW
U,
U
LU
-0
WI kOUNO
0-'I
0.0
0
-LU\L0.
U
NLUN
N0.0U
N
0(0.0ZN
NHU,
N402.
040.cc 0.4LU0-af~r
00
U
N
40-
I -J
LU
LuZ4Z
*
0U
402.4
Z0.H-.
HH-NC
4/00
4t
W
0.7
s--
.0
.
W
LIU
L
J
.U0
w~
4
0MXX
U
U,
N0-0
*U0.0
ZLU
0.0.4
II
r
2
30
FLU
LUU,
0-101.4
.L0.H
NUN
4
U
NZL.
400
H(
2
-.4.
4
100.
4-4 '0t~-YN-N-0
0--
0-p
H
-59-
Z
~ ~ ~~~~~~~~~
LIa.i
CZ3I
IIL)
U-
LI
U
a
M
LI
(JNW
N4
08
U.
InI
0
3
0
8.2
U,
U-.-3
D
..
ZU
-L
Inf0ZZ
0
.
iii
I'coIn
-
4
aZ
WV
-c
IU,
I
Z
4L
UiZ;
8W
11-
InMZ W
I
In
8.~~~~~~~~~~I
!-1
0.J
2-8.
w.3
30
-
N.N11~~~~~~
N-?
1-t
....8'
3120
W
8.,
4I
4
1
W
2~
Wd8.0
I
j
at
·
°~~~~~~~~~~~~~~~o
0
~~~~
*
b
JU
I
:EA.
Z
Z~~~~
OIL9 3,- J 0.X
..
. 0t,.
I"O
Uo
w
o*
-w04
'o-
,.'
.-.-
8.-- 7'
0
N-
-
I
N4wUJO
,
ILN-D
WZ
J810LZ
W·
e~l
""'
2"
I-
11
-0
O
CZ *I
3In
2
48.1N.W
4
U,
W
.
-<2
4..~
I,
Q
1-,-,,.1
A.11
x
*,"I Z
4
:.2L
0
-.
. LI
821-InLI
I -08In
-4
to- - 43.,
8.
w I.N
)
o~~~: a.
3
WI
w
nnnI
2
NiMi..J
4
cc- X,
LI*
1
C3i.
4
IA
2
-
. .LI .~.~t~
NWU
42;8
444
U..8.
N-InN
.0-.42
8I
111.8
M..
~ .0i~1 . II
I?
LI
3.I.NQL)8
8. 8.l a.
J
-
LU
s
70
.
,,.
1-8.
.
-
LI
--
8J
. U.
LU
-.
-W
I 133
13
N
-
I.%
~~~~~~~~~~~~~~~~~~~8--tt'"
*In8.222LU
-.-
*8
-
I
a
1-4
2t
N0nnn8
L
'
U~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~,~
C}t.
L,,
8
0
4i nZ4441-NLtIw .I-I-IP.
I
InN->
.N
W
.e...
WWL
4
W
I~II
NW0
.8
LIIL
L.n~ U.
8..I
W-"4~
.8 .
n8I.
,
-4
-or.
,=
W04
8.In
8.
.
8.
I
~ ~ ~ ~ ~ ~ 0
~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Z
0
U,~~ ~~ ~ ~
2
,it.,
NI
-4I
~n.W...iIL
0
0
ON
IN
LI)8.
Il
N
.
2
LI
0LW
1
8.
4
U
X30
4
X?2W-U,
W
I4
4LI
I..3-.8.2U.O
4W
·
8.
...
cc~.~~~ ~~~~L
w
LUNWWLILIIJ
2
710
C.)S. or=u3
11
.1 M.--.8.8.O.
.8
LI'4NLU1I8.i
N4
In
-
0
1
4
103
L
-,
.
.,X.
3W
8 .
0-1
a wax
U,
W?
8.
8.
W
N ~
IL
..-- ..eO,
0W<ou~)-2
1W1.000
Z4Ii~~~8.W~~O
4240 2
0
-
)
W3
-- r..0
00
L)L
-
~
.41
0
II-
J. LU4 W 8.NiNONNi
o
-
2
0
.
In
-
N-
8W
U,
N
In
i
.-
.
1.-.
-
4
-.
UU
J
'A-.
z3
U.
U..j
-
U,
33
1-8
P-3I1I3
N
In333
011
8.
~~~~~~~~~~~~~~~~~~~~~~~~~~~,U
Z Z~~~~~~~~~~
w
8.41-
41
Z
J
8
48.WW441-W0U~~1-I-.~
X
-
U.
LI
U..
I-
02
4"
.0
Z
U- 1(
".
2
N
0
.08. 8.Ni4UJW1-
Z 1
-
"'(3M.
U.. LI
Z
8..
U
'
1
-
.
I-1-1-I-~1C
-.-I~4I-8.1-1-~
1-U-
1
.1.
2-
0
-U,
:
,-
)-o
X Z-
40
U. .,
-60 -
4..
* tr'
:
*
5*'
, ,.
lL
O)
1
C
*
*
G
0J
W6
4W
*
*0o
*0 O
*Ub
5..
*0
J~
-,
43Cy0
I_
A
LL'M.~ -. *
P2 U
O-
Z-0*.nL
,
-
*
W
5 5*
*
3W
~
*
V.
Z0-'- - Z
v....
X~~~~~~~~~~~~~~~i
z
,I,,
-
.4-J
-
..
O
)34664..
W
0 Sr -0-0 *434.C
*
*
5
*
wd
ae
0-iW
C4W
o-
*0
w
.4.0
*
*
0-(D
S
.L
. 4 .
5
J
*
AW0-). ,,
*! W egt
c |
Z
A
WW-
*
a0
*Z
? kga
. w cJ
02-
0-
:
..
.
0- _,._.
C
NN 4.
N
_ _
-N41"
-
n4
O
.4¢
U.
0
4(MO O
0
'.4
0)
*O
2'
-_: 43
C
NN
e
. 0X*.u.
N
_ _
..
_
0-0
-N N
NZ
N)
''
44_ v-.
Cl
- -6 ''
^
zz<
0-0
*5
X
-
N N
N
*4x
.441J_
_ U. NsS
45- .
-
'-4-. J
*
* .w
-_4
-
)~
,,3
00
Z
*
*
0-
4.50-4J0-
*A
.3_
.0 _
.e .
n eK
II
*
00
O-1O
^-)
)
54^-"W
+
^ o-.4
U.U._ 4-:.-Y '.
_0
0- 0- F2
0--0
."..W -0-0-0-
JW
0 -0. .
4
2B ***I)4
---
U. U.
4X
J
2%*. )U2v
O 40
--
N
U.LtW W
ZZ
WW
4
2>
--
0 -0- _
2i_
L
.)0- 22
44_ W
5 :.ZWU.4-U.U
W
Z
W6
1
0
a
-
0-4
u4 cf
·
_.
_4-
-
^ ^
- -
_
U.
C
4.4C
+
.6
W4
-.......
V 4*
*
4S-*4G4
4+,
J,
.*
_.
W
Id
.U,
*4
0-.
0
* -O
-
*
C-
4.I,'l="4
7
WXXs0Z
U> Ws w
Ci
JU.
Z
I _ _,%IL
.Z.).
^
_
4
- 4N 4.-W
Id
Vt
*
5
C
--40-e1^
0-0^
00f +XFO.O~C
-sVoul
_N *t'N *00*0---'
4 M
WM
4-44c-40 .
.N I
g
051-0^fi<c
I5-4 W_,Uu
Id.'N4-4
-_
*0 v
4
4-
5
5
*
0*
-
.^^0-
Z
…4
0
x
w.fiz
C6
1 C4,
k
0-
of W~ ~~~~~~
4.4
.I
U
'
*
.4<t
.
41
4.J
Z
U.
-
0.
a.
4
-
*
_U *
0-
.W0-?
e *...
-
*,
G.
.Z-
c4
G.0
tJ -
O0 _-
.
-3o4
-W
w,,
V* L
- Z4
Ofl444
- - Ww W
U.F
5v4N
w 4-AAA
0-
*
0
W
2-
.6
0.4
O
W
.w,
*
4
*I,, 44-.JId~
W_
w c
0-W..
,L
4
3ZIIcZ
Z24
*
1 .FJ O
C
0-..)
4a
'.4
r
W
wC4 J
CF-W0.
..0-0-1
IL. 04
0r c0-00*
-W
(
C
c
1,X
<..)
O
c. *
0W.4)
_
0-V0
.
WI.
0 0- *02
0, Lr
_L
'o*t
.
0 C)
oI) . 4 UI
W C~
*
5*
1.
3 *4.Id.^
-N)
C J*
_
I_3
W0
U.Id)3.S0.I
IOL
0.
44 c f
FC40C C
WI.
-
.C.Z
<.
CWZ
I
SC.-C
. U
0
G
.
^ L4
44 ---
_4-0
*
*
*
*W_
*0
5.
%. A
4
-.
,
Q
' t-,**
W Cg
~ 4 .6
*~~
.624
or~
4
'.)
*
W
*549
)-.3*-I
^-30..5W
;W-W
Id4..
*
_
~' .
W
:
W4
:
Z
Z
0
- a,
*
-
*
4 o
0.
CC
(.45-
z
*
*
t-
C- I
, *0-.*
*
n4*
0 4
^N: *
* *
r
WC.)
} _- t4 .
*L
*
.
W
W6
cZ
uJ
Z U, ,. ,cJ
*
-O'
~~ ~ ~ ~-03. ~ ~ ~ ~ ~ ~ ~
~~~ ~~~~~~~~~~~~~~~~~~~~L
*
C W
im
of
G,
.
p.~~~~4
p.
4
t'
,l
*
.
Ct
-.
Z O
WU.
CZ
0W
i
0: W
41
4O
*
~.
.4
Z
U.
_
4
W6
*O
5'.tzV 4S
*
*
:0
-
0
*
*0
I
"'~~t
* .- 4
U,
c
50
*
*
* 5.
*4 _
4t
-40
O *0 0 -04-0_. vcu f
_
4-- .. 0._
4 _- 9 It 4.)
..11
5
S
000
§|"+C
-- 44^4^----0000
1J44L4o
L*sv-0-0.--00I4-.0
f
osf
~
5
-4
0-<
.:.
Oc
2.
W
0
W2ZU<
'.
QZ
2.6.6
BIBLIOGRAPHY
1.'
The American University, Machine Indexing: Problems and
Progress (Papers presented at the Third Institute on Informatic
Storage and Retrieval, February 13, 17, 1961), Washington,
D. C., 1962, 3541.
2.
Baxendale, P. B. "An Empirical Model for Computer Indexing,"
in Machine Indexing, American U., 1962, pp. 207-218.
3.
Bohnert, L. M., "New Role of Machines in Document Retrieval:
Definitions and Scope," in Machine Indexing, American U., 1962,
pp. 8-21.
4.
Luhn, H. P. (ed), Automation and Scientific Communication,
Short Papers, Part 1, American Documentation Institute,
Washington, D. C., 1963, pp. 1-128.
5.
Luhn, H. P., "Keyword-In-Context Index for Technical Literature
(KWIC Index)" in Amer. Documentation 11, 1960, pp. 288-295.
6.
Maron, M. E., "Automatic Indexing: An Experimental Inquiry,"
Report No. P-2181, 1 September 1960 (rev. 2 Feb. 1961), p.31.
7.
McCormick, E. M., "Why Computers?" in Machine Indexing,
American U., 1962, pp. 220-232.
8.
Overhage, C., Harnman, R. J. (ed). INTREX: Planning Conference,
M.I.T. Press, Cambridge, Mass., 1965.
9.
"Project INTREX: Semiannual Report," Massachusetts Institute
of Technology, 15 Sept. 1967.
10. "Project INTREX: Semiannual Report," Massachusetts Institute
of Technology, 15 March 1968.
11.
Salton, G. (ed), "Information Storage and Retrieval," Scientific
Rept. No. ISR-11, Department of Computer Science, Cornell
University, Ithaca, New York, June, 1966.
12.
Stevens, Mary E., "Automatic Indexing" A State-of-the-Art RePort," U. S. Department of Commerce, National Bureau of
Standards, NBS Monograph 91, 30 March 1965.
-61 -
Download