How are What, Who, Where, When and How Tractable?
Anna Maria Di Sciullo
0. Introduction
This paper develops the basic ideas proposed in Di Sciullo (1999a) with respect to the
formalization of context that relies on word-internal asymmetrical relations. We provide evidence that
morpho-conceptual complexity, the fact that morphemes are associated with more than one set of
categorial and semantic features, is tractable in terms of contextual information based on what we
define as local asymmetrical relations. We concentrate here on the properties of wh-morphemes, that is,
morphemes that require some yet unspecified contextual information to be tractable. We provide
evidence that the morpho-conceptual structure of these expressions provides the formal context
enabling world-context tractability.
One consequence of our proposal is that morpho-conceptual parsing can be improved. We
present the main features of a morpho-conceptual parser for the analysis of closed questions. The
prototype incorporates a grammar based on asymmetrical relations and a Unification grammar
controlled by an LALR algorithm. We show that the accuracy of morpho-conceptual parsing is
increased by an asymmetry-based grammar, where the notion of context is implemented in the
operations and the constraints. We predict that the inclusion of such parsers in information extraction
systems will lead to their optimization.
The first part of the paper presents the consequences of our formal notion of context for the
recovery of covert dimensions of context, given our integrated theory of language knowledge and
language use. The second part of the paper presents the main features of a morpho-conceptual parser
incorporating the asymmetry-based grammar and formal context for the analysis of wh-expressions.
The last section considers the consequences of the inclusion of the morpho-conceptual parser in
information processing systems.
1. Context
We assume, as in Di Sciullo (1999a), that the categorial and conceptual information (F)
spelled-out by the morphemes of natural languages is interpreted in a grammatical context (ContextG),
which we define as follows:
(1) ContextG of F
The context of interpretation of a feature F is a function of the formal context of F.
According to the definition in (1), the categorial and conceptual features associated with the
morphemes of natural languages are not fully specified and projected from the lexicon; rather, they are
constructed in the derivation of the expressions they are a part of. Evidence that this is the case comes
from the fact that the position of singular morphemes in complex words determines their F features.
Thus, a sub-set of morphemes in English may be spelled-out either at the left or at the right edge of a
word, as discussed in Di Sciullo (1999a,b). We concentrate here on cases where the position of the
morpheme is constant and a sub-set of its features is covertly specified in its local formal context.
This situation is illustrated in (2) with wh-expressions.
(2) a. Who invented chemistry?
b. What does the proton do?
c. When did the eclipse happen?
d. Where did the meteorite fall?
We argue that the morpho-conceptual structure of the wh-words in (2) locally provides the covert
conceptual features that make the expression tractable without the morphological spell-out of these
features. In the case at hand, these features are the [person] feature in (2a), the primary actor of the
event denoted by the expression it is a part of, the [activity] feature in (2b), the [event time] feature
in (2c) and the [locus of the event] feature in (2d). We show that formal context is the basis upon which
predictions can be made with respect to the recoverability of the full conceptual features in the
world-context.
2. Grammar and parsing
We assume, as in Di Sciullo (1999a), that singular grammars are the instantiation of Universal
Grammar (UG) parameters in terms of differences in the strength of morpho-functional features
(Chomsky, 1998, and related works) and that parsing singular languages is performed by the
processing of specific UG parameters. We posit that, just as UG is designed to optimally derive
linguistic expressions in terms of asymmetrical relations, the Universal Parser (UP) is designed to
optimally process natural language asymmetries. An asymmetrical relation is defined as follows:
Asymmetrical Relation (x, y)
x and y are in asymmetrical relation, iff there is a formal relation r that is true for the pair (x,
y) and false for the pair (y, x).
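The definition can be stated as a short check. The following is only a minimal sketch: the relation names (precedes, co_occurs) and the two positions m1 and m2 are hypothetical illustrations, not part of the grammar proposed here.

```python
from typing import Callable, Iterable

Relation = Callable[[str, str], bool]


def holds_asymmetrically(r: Relation, x: str, y: str) -> bool:
    """r is true for the pair (x, y) and false for the pair (y, x)."""
    return r(x, y) and not r(y, x)


def in_asymmetrical_relation(relations: Iterable[Relation], x: str, y: str) -> bool:
    """x and y are in asymmetrical relation iff some formal relation r
    holds asymmetrically of the pair (x, y)."""
    return any(holds_asymmetrically(r, x, y) for r in relations)


# Hypothetical toy data: two positions in a word, ordered left to right.
POSITIONS = {"m1": 0, "m2": 1}


def precedes(a: str, b: str) -> bool:
    """Asymmetric on distinct positions: a is to the left of b."""
    return POSITIONS[a] < POSITIONS[b]


def co_occurs(a: str, b: str) -> bool:
    """Symmetric: both elements belong to the same word."""
    return a in POSITIONS and b in POSITIONS
```

Under these assumptions, in_asymmetrical_relation([precedes, co_occurs], "m1", "m2") is true because precedes holds in one direction only, whereas a set containing only the symmetric co_occurs yields false.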
We present the main features of a prototype implementing the asymmetry-based grammar
for the analysis of morpho-conceptual structure. We will refer to this prototype as
CONCE-MORPHO-PARSE. We show that it correctly recovers the covert conceptual features supported by
wh-expressions. These features, even if not supported by overt morphemes, are tractable in the formal
context and traceable in the world-context.
The morphological parsing is performed by a Unification grammar incorporating an LALR(1)
control structure. The prototype builds morphological trees for wh-expressions (W), providing a
structure to the morphemes (operator (O), modifier (m), variable (v) and restrictor (r)) they are
composed of. We provide evidence from Germanic and Romance languages that a sub-set of
wh-expressions include a modification structure, supporting the spatio-temporal features of the event or
the situation denoted by the expression they are a part of. We also show that wh-words are not fully
specified with respect to their conceptual structure, whereas they are with respect to the formal context
in which they can be tractable.
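To illustrate how a Unification grammar can leave a wh-word underspecified and let the context supply the missing features, here is a minimal sketch of feature-structure unification over nested dictionaries. It is not the prototype's implementation: the feature names (cat, operator, variable, restrictor) are hypothetical, re-entrancy (shared sub-structures) is not handled, and the LALR(1) control structure is not modelled.

```python
def unify(a, b):
    """Return the unification of two feature structures, or None on a clash.

    A feature structure is either an atomic value (here, a string) or a
    dict mapping feature names to feature structures.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)  # shallow copy so that the inputs stay untouched
        for feature, b_value in b.items():
            if feature in result:
                merged = unify(result[feature], b_value)
                if merged is None:
                    return None  # incompatible values for this feature
                result[feature] = merged
            else:
                result[feature] = b_value  # feature supplied by b only
        return result
    # Atomic values (or an atom against a dict) unify only if identical.
    return a if a == b else None


# Hypothetical entry: the variable of the wh-word is left unrestricted.
wh_word = {"cat": "W", "operator": {"wh": "+"}, "variable": {}}
# Hypothetical information contributed by the formal context.
context = {"variable": {"restrictor": "human"}}

resolved = unify(wh_word, context)
clash = unify(resolved, {"variable": {"restrictor": "place"}})
```

In this sketch, resolved carries the restrictor contributed by the context, while clash is None because the two restrictor values are incompatible.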
3. Consequences for information processing
In this section, we present a view of information extraction that relies on asymmetry-based
natural language processing.
According to our proposal, the semantic information conveyed by the lexical items, in
particular wh-words, is restricted by the asymmetrical relations given by UG. The morpho-conceptual
information attached to documents is thus a function of the information conveyed by the lexical items
and their composition in the formal context of the expressions contained in the documents. We outline
the main features of an Information Processing system incorporating morpho-conceptual parsing.
We assume that information processing systems are series of filters applied to texts (documents and
queries). The first filter submits a text to morphological and lexical analysis, assigning tags,
including morpho-conceptual feature structures, to words. In this system the dictionary plays an
important part in tagging.
In particular, the lexical entries for derivational affixes, as well as functional categories, carry
important information about the syntactic and the semantic structure of the linguistic expressions.
The initial tagging is performed on the basis of a limited dictionary consisting of function
words. The function words are specified in the dictionary in our features-under-asymmetrical-relations
format. Thus, wh-words, as they are functional categories, are part of this lexicon. Each wh-word is
associated with an operator-variable structure with a partly restricted variable structure and possibly
with a modification structure.
who : DP(x, wh), OP(x, human)
why : PP(x, wh), OP(M, purpose(x, event))
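The two entries above could be stored as records along the following lines. This is a sketch under stated assumptions: the class WhEntry and its field names are hypothetical, and treating M as the marker of a modification structure is our reading of the entry for why, not a documented property of the dictionary format.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class WhEntry:
    """One wh-word of the function-word dictionary (hypothetical layout)."""
    form: str                       # the wh-word itself
    category: str                   # category of the wh-phrase: "DP" or "PP"
    operator_args: Tuple[str, ...]  # arguments of OP, copied from the entry

    @property
    def has_modification(self) -> bool:
        # Assumption: an OP whose first argument is M carries a
        # modification structure.
        return self.operator_args[0] == "M"

    def notation(self) -> str:
        """Render the entry in the notation used in the text."""
        args = ", ".join(self.operator_args)
        return f"{self.form} : {self.category}(x, wh), OP({args})"


LEXICON = {
    "who": WhEntry("who", "DP", ("x", "human")),
    "why": WhEntry("why", "PP", ("M", "purpose(x, event)")),
}
```

With this layout, LEXICON["who"].notation() gives back the first entry as written above, and only the entry for why reports a modification structure.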
A comparison of the text words against the dictionary is performed, one sentence at a time, by
a sequential merging process. As a result of the look-up process, any word found in the dictionary will
have received one or more tags. Most content words, which are not listed in the dictionary, are tagged
using the morphological information associated with the wh-words; who, for example, can be interpreted
as the origin of the event, as the entity undergoing a change of position or a change of state, or as
the terminus of the event. Disambiguation is performed by analyzing the local morpho-syntactic context
using a bottom-up chart parser and a Unification grammar, that is, CONCE-MORPHO-PARSE, to form
constituents and to compositionally derive the meaning of constituents from the meaning of their parts.
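The initial look-up step can be sketched as follows. This is only an illustration: a plain dictionary look-up stands in for the sequential merging process, the wh-tags reuse the covert features listed for (2), and the remaining tags (AUX, DET, UNKNOWN) as well as the function names are hypothetical. Disambiguation by the chart parser is not modelled.

```python
import re
from typing import Dict, List, Tuple

# Limited dictionary of function words (hypothetical tag names).
FUNCTION_WORD_TAGS: Dict[str, List[str]] = {
    "who": ["WH:person"],
    "what": ["WH:activity"],
    "when": ["WH:event time"],
    "where": ["WH:locus of the event"],
    "did": ["AUX"],
    "does": ["AUX"],
    "the": ["DET"],
}

TaggedWord = Tuple[str, List[str]]


def tag_sentence(sentence: str) -> List[TaggedWord]:
    """Tag each word of one sentence; words absent from the dictionary
    are marked UNKNOWN and left to later morphological analysis."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return [(word, FUNCTION_WORD_TAGS.get(word, ["UNKNOWN"])) for word in words]


def tag_text(text: str) -> List[List[TaggedWord]]:
    """Split a text into sentences and tag them one at a time."""
    sentences = [s for s in re.split(r"(?<=[.?!])\s+", text.strip()) if s]
    return [tag_sentence(s) for s in sentences]


tagged = tag_text("Who invented chemistry? Where did the meteorite fall?")
```

Here tagged contains one list per sentence; who and where receive their wh-tags from the dictionary, while invented, chemistry, meteorite and fall remain UNKNOWN at this stage.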
As the recovery of covert morpho-conceptual features in formal context contributes to the
spelling-out of world context, the integration of CONCE-MORPHO-PARSE in Information
Processing systems contributes to their optimization.
