Automatic Dictionaries in the ETAP-3 System

Ju.D.Apresjan, I.M.Boguslavsky, L.L.Iomdin, A.V.Lazursky, L.G.Mitjushin, V.Z.Sannikov, L.L.Tsinman
Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow

From: AAAI Technical Report SS-93-02. Compilation copyright © 1993, AAAI (www.aaai.org). All rights reserved.

1. General Information

ETAP-3 in its present state is a multipurpose natural language processor implemented on the VAX-6340 computer. It is an outgrowth of the ETAP-2 machine translation (MT) system (see [1-2]) and has two main options: MT proper and a natural language interface to databases. The former is a bidirectional (English-to-Russian and Russian-to-English) MT system for scientific and technical texts, which it translates from one natural language into another. The latter translates sentences (queries) from a natural language into the Structured Query Language (SQL), a standardized language of communication with relational databases [3].

The two options are joined so that the processor's basic linguistic knowledge (morphological, syntactic and semantic rules, and the morphological and combinatory dictionaries) and its knowledge representation formats are used by both. The linguistic knowledge, the representation formats and the algorithms are completely independent of one another, so that the dictionaries can be easily enlarged. The processor itself is not domain-oriented: when transferred to a new domain, it does not require any re-adaptation apart from the dictionaries.

The system is written in the PL/I and C programming languages. On the VAX-6340, a sentence of average complexity (up to 15 words for queries, up to 30 words for MT) is processed in 4-5 to 10-15 seconds, depending on the option.
Here is a sample of direct and back machine translation by ETAP-3:

English source: In recent years, government regulations, technical demands, microprocessor development and a host of other influences have forced a revolution in chemical, biochemical and industrial instrumentation.

Russian translation: V nedavnie gody pravitel'stvennye postanovlenija, texnicheskie trebovanija, razvitie mikroprocessorov i mnozhestvo drugix faktorov vyzvali perevorot v ximicheskom, bioximicheskom i promyshlennom oborudovanii.

Back translation into English: In recent years government decrees, technical demands, development of microprocessors and a set of other factors caused a revolution in chemical, biochemical and industrial equipment.

2. Linguistic Background

Linguistically, ETAP-3 takes its inspiration from Mel'chuk's "Meaning ⇔ Text" theory [4].

Texts are represented at three levels in the MT system (morphological, syntactic, and deep, i.e. normalized, syntactic) and at four levels in the natural language interface, which has an additional level of semantic representation. This difference is accounted for by the fact that natural languages display a fundamental similarity of structure, so that the deep syntactic level is sufficient for a transition from the source language to the target language. The mismatch between a natural language and the language of queries to databases is more fundamental, and to overcome it a semantic level becomes necessary. In what follows we concentrate on the MT system.

The morphological representation of a sentence assigns to each wordform sets of lexeme names with morphological characteristics (specifications of number, case, tense, aspect, etc.). In case of homonymous wordforms, alternative representations are generated.

Syntactic and deep syntactic structures are dependency trees. Their nodes are lexeme names with their characteristics, and their arcs are labelled with the names of subordination relations (40 to 60 of them, depending on the language).
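The two kinds of representation described in this section (alternative morphological readings per wordform, and dependency trees with labelled arcs) can be sketched as plain data structures. This is a minimal illustration, not the ETAP-3 format: the class names, the feature tags and the relation names `predicative` and `completive` are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Reading:
    """One morphological reading: a lexeme name plus its characteristics."""
    lexeme: str
    features: frozenset


@dataclass
class Node:
    """A dependency-tree node; `relation` labels the arc coming from its head."""
    lexeme: str
    relation: str = "root"
    children: list = field(default_factory=list)

    def add(self, child):
        """Attach a dependent and return this node so calls can be chained."""
        self.children.append(child)
        return self


def arcs(node):
    """Yield (head, relation, dependent) triples in depth-first order."""
    for child in node.children:
        yield (node.lexeme, child.relation, child.lexeme)
        yield from arcs(child)


# A homonymous wordform receives alternative readings (hypothetical tags).
morph = {
    "demands": [
        Reading("DEMAND", frozenset({"N", "PL"})),
        Reading("DEMAND", frozenset({"V", "PRES", "3SG"})),
    ]
}

# A tiny tree for "factors caused (a) revolution" (hypothetical relation names).
tree = (
    Node("CAUSE")
    .add(Node("FACTOR", "predicative"))
    .add(Node("REVOLUTION", "completive"))
)
```

Calling `list(arcs(tree))` lists the labelled arcs of the tree, which is often the most convenient form for rules that test for a particular relation.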
3. Basic Components

The two basic components of each MT option (i.e. English-to-Russian and Russian-to-English) are sets of rules and dictionaries.

Translation proceeds in phases. First, the sentence submitted is processed by morphological analysis, which produces its morphological structure. Parsing then takes place: syntactic rules applied to the morphological structure produce all admissible hypothetical binary dependency relations between the units of the sentence, and a sophisticated system of filters and other devices extracts a dependency tree (or a number of trees, in case of syntactic homonymy) out of these hypotheses. The resulting syntactic structure is mapped onto a deep syntactic structure.

At the next stage, transfer (translation proper), the deep syntactic structure of the source sentence is transformed into a deep syntactic structure of the target language: lexical units are replaced by their target-language counterparts, and language-specific subtrees are transformed.

Finally, during synthesis, the deep syntactic structure obtained is expanded into a syntactic structure of the target language and then into a morphological structure, to which the rules of morphological synthesis are applied to produce the sentence in conventional orthography.
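The phases named in this section (morphological analysis, parsing by generating link hypotheses and filtering them, transfer, synthesis) can be illustrated with a toy pipeline. Everything below is hypothetical: the two tiny lexicons, the one-line filter and the output notation stand in for the real rules and dictionaries, and the normalization step between syntactic and deep syntactic structure is left out for brevity.

```python
# Toy stand-ins for the real dictionaries (all entries are hypothetical).
MORPH = {
    "factors": ("FACTOR", "PL"),
    "caused": ("CAUSE", "PAST"),
    "revolution": ("REVOLUTION", "SG"),
}
TRANSFER = {"FACTOR": "FAKTOR", "CAUSE": "VYZVAT'", "REVOLUTION": "PEREVOROT"}


def analyse(sentence):
    """Morphological analysis: map each wordform to (lexeme, characteristics)."""
    return [MORPH[word] for word in sentence.lower().split()]


def parse(morph):
    """Parsing: propose every head-dependent pair, then filter down to a tree."""
    size = len(morph)
    hypotheses = [(h, d) for h in range(size) for d in range(size) if h != d]
    # Toy filter: the past-tense verb is the root and heads every other word.
    root = next(i for i, (_, feats) in enumerate(morph) if feats == "PAST")
    links = [(h, d) for h, d in hypotheses if h == root]
    return {"root": root, "links": links, "nodes": morph}


def transfer(tree):
    """Transfer: replace source lexemes by target counterparts, keeping the links."""
    nodes = [(TRANSFER[lexeme], feats) for lexeme, feats in tree["nodes"]]
    return {**tree, "nodes": nodes}


def synthesise(tree):
    """Synthesis: linearise the nodes (real synthesis would inflect them)."""
    return " ".join(f"{lexeme}.{feats}" for lexeme, feats in tree["nodes"])


def translate(sentence):
    """Run the phases in order."""
    return synthesise(transfer(parse(analyse(sentence))))
```

The point of the sketch is the shape of the computation: parsing over-generates and then filters, and transfer operates on the tree rather than on the word string.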
4. Dictionaries

The MT system makes use of four dictionaries: two monolingual morphological dictionaries and two combinatory dictionaries, one of each for Russian and English, the two working languages of the system. All dictionary entries (except for the translation zone of the combinatory dictionaries) are addressed in both directions of translation.

While the morphological dictionaries are used for morphological analysis/synthesis only, the combinatory dictionaries store extensive information on the syntactic, semantic, co-occurrence and other properties of lexemes, and provide the material necessary for parsing and subsequent transfer.
In particular, every entry of the combinatory dictionary specifies for its keyword:

(1) its part of speech;

(2) its simplest and most common translation into the target language;

(3) its syntactic features, allowing or precluding its occurrence in certain constructions; for example, the feature "passto", assigned to verbs such as say, report and declare, marks the ability of the verb to govern the to-infinitive in the passive voice (cf. He was reported to have failed);

(4) its semantic features (descriptors), such as 'person', 'institution', 'information', 'action', 'time', 'money', or 'parameter' (for nouns like length, pressure, velocity, colour);

(5) its government pattern, i.e. a specification of the morphological, syntactic and semantic requirements that its potential dependents (actants) must meet. For example, the entry for rent (cf. He rented the house from his neighbour for a year) indicates that the verb governs a direct object NP, a counteragent phrase introduced by from with the semantic feature 'person' (from somebody), a duration phrase with the feature 'time' (for how long), and a price phrase with the feature 'money' (for how much).
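The five zones of a combinatory dictionary entry listed above can be pictured as a record type. The sketch below is an assumption-laden illustration rather than the ETAP-3 entry format: the class and field names, the Russian equivalents and the matching function are invented for the example, while the slots for rent and the feature "passto" follow the text.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Slot:
    """One actant in a government pattern (hypothetical layout)."""
    role: str
    preposition: Optional[str]  # None means a direct NP
    descriptor: Optional[str]   # None means no semantic restriction


@dataclass
class Entry:
    """A combinatory dictionary entry with the five zones named in the text."""
    keyword: str
    part_of_speech: str
    translation: str
    syntactic_features: set = field(default_factory=set)
    descriptors: set = field(default_factory=set)
    government: list = field(default_factory=list)


RENT = Entry(
    keyword="RENT",
    part_of_speech="V",
    translation="ARENDOVAT'",  # illustrative equivalent, not taken from the source
    government=[
        Slot("object", None, None),
        Slot("counteragent", "from", "person"),
        Slot("duration", "for", "time"),
        Slot("price", "for", "money"),
    ],
)

REPORT = Entry(
    keyword="REPORT",
    part_of_speech="V",
    translation="SOOBSHCHAT'",  # illustrative equivalent, not taken from the source
    syntactic_features={"passto"},
)


def fill_slot(entry, preposition, descriptor):
    """Return the role of the first slot the dependent satisfies, or None."""
    for slot in entry.government:
        same_preposition = slot.preposition == preposition
        descriptor_ok = slot.descriptor is None or slot.descriptor == descriptor
        if same_preposition and descriptor_ok:
            return slot.role
    return None
```

With such a record, the semantic descriptor is what separates the two for-phrases of rent: `fill_slot(RENT, "for", "time")` selects the duration slot and `fill_slot(RENT, "for", "money")` the price slot, which the preposition alone could not decide.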
A dictionary entry may also contain references to rules which should be activated when the keyword occurs in the sentence being processed; such rules are word-specific. For example, the entries of the Russian verbs privlekat', udeljat' and sosredotochivat' contain a reference to the specific transfer rule that provides the idiomatic translation of phrases like privlekat' vnimanie 'to draw attention', udeljat' vnimanie 'to pay attention', sosredotochivat' vnimanie 'to focus attention', and so on. The rules referred to in the dictionary in this way (their total number in the system is well over a thousand) pertain to all non-morphological phases of processing.

Such an extent of sophistication of the combinatory dictionary entries makes it virtually impossible to devise an automatic procedure that would compile them directly from man-made ("human") monolingual or bilingual dictionaries or from a large text corpus, although such dictionaries and corpus examples are extensively used in lexicographic work. Besides that, it is necessary to produce prototypical dictionary entries for classes of words in order to facilitate the actual work of compilation.

Morphological dictionary entries, on the other hand, can be produced semi-automatically. ETAP-3 uses a special dialogue (question-answering) system, based on a simple algorithm of generating wordforms, which helps to create the entries of the Russian morphological dictionary. This substantially facilitates compilation for highly inflected languages such as Russian, where the paradigms of many verbs contain as many as 250 wordforms.

References

1. Ju.D. Apresjan et al. Lingvisticheskoe obespechenie sistemy ETAP-2. (The linguistics of the ETAP-2 system.) Moskva: Nauka, 1989.
2. Ju.D. Apresjan et al. ETAP-2: The Linguistics of a Machine Translation System. META, 1992, vol. 37, No 1, pp. 97-112.
3. Ju.D. Apresjan et al. Lingvisticheskij processor dlja slozhnyx informacionnyx sistem. (A linguistic processor for advanced information systems.) Moskva: Nauka, 1992.
4. I.A. Mel'chuk. Opyt teorii lingvisticheskix modelej tipa "Smysl ⇔ Tekst". (The theory of "Meaning ⇔ Text" type linguistic models.) Moskva: Nauka, 1974.