Functional Type Theory

CHAPTER TWO: FUNCTIONAL TYPE THEORY
2.1. Compositionality
We have given a compositional semantics for predicate logic, a semantics in which
the meaning of any complex expression is a function of the meanings of its parts and
the way these meanings are put together. In predicate logic, complex expressions are
formulas, and their parts are formulas, terms (constants or variables), n-place
predicates, and the logical constants.
As is well known to anybody who has gone through lists of exercises of the form
'translate into predicate logic', predicate logic does a much better job of representing
the meanings of various types of natural language sentences than it does of
representing the meanings of their parts.
Take for example the following sentence:
(1) Some old man and every boy kissed and hugged Mary.
With the one-place predicate constants MAN, OLD, BOY and the two-place predicate
constants KISSED and HUGGED, we can quite adequately represent (1) in predicate
logic as (1'):
(1') ∃x[MAN(x) ∧ OLD(x) ∧ KISSED(x,m) ∧ HUGGED(x,m)] ∧
∀y[BOY(y) → [KISSED(y,m) ∧ HUGGED(y,m)]]
The problem is that without an explicit syntax and a compositional semantics,
representing (1) as (1') is a completely creative exercise, and it shouldn't be.
What we really need is a semantics that gives us the right semantic representation for
sentence (1) (and hence the right meaning), and that tells us at the same time how this
semantic representation is related to the meanings of the parts of (1).
Our representation (1') has various parts (like ∧ and →) that don't correspond to any part of
the sentence; some parts of the representation occur twice in different forms, while in
the sentence there is only one part (kissed and hugged Mary); the representation
contains sentence conjunction ∧, whereas the sentence contains NP conjunction and
verb conjunction; OLD is separated from MAN in the representation, which is
motivated by what the meaning should be, but nothing explains how it got separated,
etc. What we really need is a logical language in which we can represent the meaning
of sentence (1) and the meanings of its parts:
OLD/ MAN/ SOME/ OLD MAN/ SOME OLD MAN/ EVERY/ BOY/ EVERY BOY/
AND/ SOME OLD MAN AND EVERY BOY/ KISSED/ HUGGED/ AND/ KISSED
AND HUGGED/ MARY/ KISSED AND HUGGED MARY/ SOME OLD MAN
AND EVERY BOY KISSED AND HUGGED MARY
Similar arguments apply to the structure of noun phrases. We represent sentences (2)
and (3) in predicate logic as (2') and (3'):
(2) Every boy walks.
(2') ∀x[BOY(x) → WALK(x)]
(3) Some boy walks.
(3') ∃x[BOY(x) ∧ WALK(x)]
In both these representations, the connective → or ∧ is put in to get the meaning of
the sentence right, but there isn't a theory of where this comes from. And this shows
up when we start to think about other determiners:
(4) Most boys walk.
(4') Mx[BOY(x) ? WALK(x)]
Even if we were to add to predicate logic a new quantifier Mx and try to get its
semantics right, we couldn't get to the right meaning, because we can prove that not
only are both the truth functions → and ∧ inadequate (whatever definition of Mx we
manage to provide), but much stronger, there is provably no truth function (and hence
no connective) that could provide the right meaning here. The meaning of most falls
outside the range of a logical theory of quantification with the power of predicate
logic.
This means that we have to study the meanings of determiners - of which every and
some are only two instances - much more systematically. And when we do that, we
see that there are all sorts of semantic inference patterns that we find across classes of
different determiners: certain determiners have, by their meaning, properties in
common.
For instance, we find the following entailment patterns:
In the examples in (6) we find an entailment from (6a) with walk to (6c) with move,
while in the examples in (7) we find the inverse entailment from (7a) with move to
(7c) with walk.
(6a) Every/some/at least three/more than half of the boys walk.
(6b) Whoever walks, moves.
(6c) Hence, every/some/at least three/more than half of the boys move.

(7a) No/not every/at most three/less than half of the boys move.
(7b) Whoever walks, moves.
(7c) Hence, no/not every/at most three/less than half of the boys walk.
These inferences are obviously related to the meanings of the determiners involved,
and in order to explain them, we need a theory of determiner meanings.
We are thus faced with the general problem of formulating a semantic theory in which
we can assign meanings to expressions of all sorts of linguistic categories, and in
which we can formulate semantic operations on such meanings, which will produce
the correct sentence meanings, predicting the correct inference patterns.
In order to get the general idea behind such a theory, let me digress once more a bit on
predicate logic. I claimed that we had successfully given a compositional semantics
for predicate logic. Yet, when we look at the semantics more carefully, we see that
there are certain parts of expressions of predicate logic that I have not specified an
interpretation for. In particular, I have not specified an interpretation for the logical
constants. That is, I have not specified ⟦¬⟧M,g, ⟦∧⟧M,g, ⟦∃x⟧M,g, etc.
While I have indeed not specified such interpretations, the semantics is nevertheless a
fully specified compositional semantics, because for each of the logical constants, I
have completely and explicitly specified its semantic contribution.
For instance, the clause in the truth definition for ¬ is the following:
⟦¬φ⟧M,g = 1 iff ⟦φ⟧M,g = 0; 0 otherwise.
This tells us that the semantic contribution of adding a negation to a sentence φ - any
sentence φ - is to reverse the truth value, i.e. it takes the truth value of φ in M relative
to g, and it specifies the opposite one as the truth value for ¬φ in M relative to g.
This doesn't explicitly tell us what the interpretation of ¬ should be, but it does tell us
this implicitly:
The interpretation of ¬ in M relative to g is that entity that does the
above, i.e. that entity that has that semantic effect.
This entity takes a truth value (the truth value of the complement) as input, and
produces a truth value as output (the truth value of the whole sentence). Given that
there are assumed to be only two truth values 0 and 1, and that the semantic clauses
are formulated in such a way that ¬ has a definite semantic effect for every possible
input, there is only one entity that the interpretation of ¬ can be:
For every model M and assignment g: ⟦¬⟧M,g = {<0,1>,<1,0>}
Negation is interpreted as the one-place truth function which maps 0 onto 1 and 1 onto
0. A similar argument shows that ∧ is interpreted as the two-place truth function
{<<1,1>,1>,<<1,0>,0>,<<0,1>,0>,<<0,0>,0>}.
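As a quick illustration (a toy sketch in Python, not part of the formal apparatus of the chapter), these two truth functions can be written out as finite maps; the dictionary entries below are exactly the ordered pairs just listed.

# The interpretation of negation: the one-place truth function {<0,1>,<1,0>}.
NEG = {0: 1, 1: 0}

# The interpretation of conjunction: the two-place truth function
# {<<1,1>,1>, <<1,0>,0>, <<0,1>,0>, <<0,0>,0>}.
CONJ = {(1, 1): 1, (1, 0): 0, (0, 1): 0, (0, 0): 0}

assert NEG[CONJ[(1, 0)]] == 1   # the negation of (1 and 0) is 1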
When we look at the semantics of the quantifiers, we see that in a fully explicit
compositional semantics for predicate logic, where indeed every part is assigned an
interpretation, we have to make all our interpretations a bit more complicated. I will
not go into that here (see, e.g. Landman 1991).
The point of the comparison is the following: we have natural interpretations for
certain of the parts of our natural language expressions. We need to find
interpretations for the other ones. Our strategy will typically be to first look and
determine the semantic contribution of these expressions (as we did for negation).
And then we look for the entity which has that semantic contribution.
For instance, suppose we have decided that the noun MAN is, as before, a one-place
predicate constant, interpreted as a one-place property. Now we are looking for the
interpretation of the adjective old. We observe that there is good reason to assume that
old man is, like man, a noun (but a complex noun), and that it is a one-place predicate
just like man. Let us introduce an expression OLDn into the logical language, which
will be the representation of the prenominal adjective old. Let us add a rule to the
logical language which says:
If P is a one-place predicate then OLDn(P) is a one-place predicate.
Now let us assume that we have another one-place predicate OLDp, which we do not
treat as the representation of the pre-nominal adjective old, but rather as the
representation of the predicative adjective. Then it is quite clear what the semantic
contribution of the prenominal adjective old, and hence of the expression OLDn, is.
We can formulate the following semantic rule:
⟦OLDn(P)⟧M,g = ⟦OLDp⟧M,g ∩ ⟦P⟧M,g
This semantic rule completely specifies the semantics of the expression OLDn in a
functional way: it tells you what output it produces (⟦OLDp⟧M,g ∩ ⟦P⟧M,g), when you
specify any possible input (⟦P⟧M,g). The task of finding an interpretation for OLDn,
i.e. of defining ⟦OLDn⟧M,g, then comes down to finding the entity that does exactly
that. The entity that does this is the function which maps any set onto the intersection
of that set and ⟦OLDp⟧M,g. This function is henceforth the interpretation of OLDn,
and hence we need to have in our models domains that contain such functions.
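To make the 'find the entity that does exactly that' step concrete, here is a small sketch (Python, with toy sets of my own choosing, not from the text): predicates are modelled as sets of individuals, and the interpretation of OLDn is the function mapping any set onto its intersection with the interpretation of OLDp.

# Toy one-place predicates, modelled as sets of individuals (illustrative only).
OLD_P = {"john", "bill"}          # interpretation of the predicative adjective OLDp
MAN   = {"john", "bill", "fred"}  # interpretation of the noun MAN

# The interpretation of the prenominal adjective OLDn: the function that maps
# any set P onto the intersection of P with the interpretation of OLDp.
def OLD_N(P):
    return OLD_P & P

print(OLD_N(MAN))   # {'john', 'bill'}: the old men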
To give another example, let us look at (2):
(2) Every boy walks.
Let us assume, as before, that BOY and WALK are one-place predicates, and let us
assume that the meaning of the sentence is adequately represented by the predicate
logical formula (2').
(2') ∀x[BOY(x) → WALK(x)]
We are interested in the meaning of every and the meaning of the noun phrase every
boy. Let us for that reason add an expression EVERY to the language and a rule that
forms a term out of this expression and predicates that are used to represent nouns:
If Q is a one-place predicate then EVERY(Q) is a term.
The terms that we had before (constants and variables) were interpreted as
individuals. We cannot possibly do this for terms of the form EVERY(BOY),
because it is intuitively clear that whatever such a term stands for, it isn't an
individual. Yet, EVERY(BOY) has to combine with the verbal predicate WALK to
give a sentence.
Here we apply the basic trick of the functional theory. If EVERY(BOY) cannot be
the argument of WALK (because that would make its interpretation an individual), we
can assume that WALK is in fact the argument of EVERY(BOY). We add a rule for
predicates that represent verbs:
If P is a one-place predicate then EVERY(Q) (P) is a formula.
If we assume that the predicates P and Q are interpreted just as usual, then we know
what the semantics of the sentence EVERY(Q) (P) should be like:
⟦EVERY(Q)(P)⟧M,g = 1 iff for every d ∈ ⟦Q⟧M,g: d ∈ ⟦P⟧M,g; 0 otherwise
This tells us again, just as before, implicitly what the semantic contribution of the term
EVERY(Q) is: the interpretation of EVERY(Q) takes a set as argument (⟦P⟧M,g) and
gives as result truth value 1 iff every d ∈ ⟦Q⟧M,g is in that set (⟦P⟧M,g).
This means that ⟦EVERY(Q)⟧M,g is the function that does exactly that. And there is
indeed a unique function f that does precisely that for any set. Once we know what
the interpretation of EVERY(Q) is, we can apply the same procedure to find the
interpretation of EVERY.
The semantic effect of EVERY is completely determined by the semantics for
EVERY(Q)(P). We know that EVERY combines with Q to form the term
EVERY(Q). We know the interpretation of Q: ⟦Q⟧M,g; we know the interpretation of
EVERY(Q): ⟦EVERY(Q)⟧M,g = f, the function that we have just determined.
This means that the semantic effect of the interpretation of EVERY itself is to take a
set X (X = ⟦Q⟧M,g) as input, and give as output the corresponding function fX (the
function f determined by that X). Hence, ⟦EVERY⟧M,g is the unique function h that
takes any set X as input and gives the corresponding function fX as output.
We see here that we need rather complicated domains in our models to find the right
entities for the interpretation of our expressions: a determiner like EVERY is
interpreted as a function which maps sets onto functions from sets into truth values.
While the function h which we have assigned as the interpretation of every adequately
captures the meaning of every, this is not visually perspicuous; in fact, it is hard to see what it
does at all. This is where the logical language comes in. We will in the next chapter
add to our logical language the lambda-operator, which is an operator which allows us
to represent functions in our logical language. With that operator, we will represent
the meaning of every perspicuously as λQλP.∀x[Q(x) → P(x)]. We will show that
this expression is indeed interpreted precisely as the function that I did my best to
describe above, and we will discuss some properties of the logical language which
will make it very easy to see that this is indeed an adequate representation of the
meaning of every. Moreover, this semantics will be seen to extend effortlessly to
other determiners, which are not part of predicate logic.
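As a foretaste of how this pays off for determiners beyond every and some, here is a toy sketch (Python, with predicates modelled as sets of my own choosing; the 'more than half' analysis of most is a standard generalized-quantifier assumption, not something established in this chapter): both determiners are functions from sets to functions from sets to truth values.

# Determiner meanings as functions from sets to functions from sets to truth values.
def EVERY(Q):
    return lambda P: 1 if all(d in P for d in Q) else 0

def MOST(Q):
    # 'more than half of the Q are P' - one standard generalized-quantifier analysis.
    return lambda P: 1 if 2 * len(Q & P) > len(Q) else 0

BOY  = {"john", "bill", "fred"}
WALK = {"john", "bill"}

print(EVERY(BOY)(WALK), MOST(BOY)(WALK))   # 0 1: not every boy walks, but most boys do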
It turns out that we can fruitfully apply this procedure of finding functions from the
semantic clauses to all parts of natural language that we are interested in. That is,
along the above method we can find adequate representations of the meanings of all
parts of natural language expressions we may want to interpret, in a theory which has
enough domains of different kinds of functions, and which is based on the two basic
principles of function-argument application (functional application) and function
definition (lambda abstraction). Such a theory we will now formulate.
2.2. The syntax of functional type logic.
We will start by specifying the language of functional type logic and its semantics.
For the sake of compactness, I will give the whole language, including the lambda-operator
and its semantics, but we will ignore that part till later.
The language of functional type logic gives us a rich theory of functions and
arguments. Our theory of functions is typed. This means that for each function in the
theory it is specified what its domain and range is, and functions are restricted to their
domain and range. This means that the theory of functions is a theory of extensional
functions, in the set-theoretic sense of the word:
a function f from set A into set B, f: A → B, is a set of ordered pairs in A × B
such that: 1. for every a ∈ A: there is a b ∈ B such that <a,b> ∈ f
2. for every a ∈ A, b,b' ∈ B: if <a,b> ∈ f and <a,b'> ∈ f then b = b'
The domain of f, dom(f) = {a ∈ A: ∃b ∈ B: <a,b> ∈ f}
The range of f, ran(f) = {b ∈ B: ∃a ∈ A: <a,b> ∈ f}
When we talk about, say, the function + of addition on the natural numbers as being
the same function as addition on the real numbers, we are using an intensional notion
of function: extensionally +: N × N → N and +: R × R → R are different functions,
because their domain and range are different and because they are different sets of
ordered pairs. They are the same function, intensionally, in that these functions have
the same relevant mathematical properties. Our theory of functions will be a theory
of extensional functions, and that means that each function is extensionally specified
as a set of ordered pairs with a fixed domain and fixed range.
This means that functions can only apply to arguments which are in their domain.
Partly as a means to avoid the set-theoretic paradoxes, we carry this over into our
logical language, which contains expressions denoting these functions. We assume
that each functional expression comes with an indication of its type, which in essence
is an indication of what the domain and range of its interpretation will be. The syntax
of our language will only allow a functional expression f of a certain type a to apply
to an argument of the type of expressions that denote entities in the domain of the
interpretation of f.
This is the basic idea of the functional language:
-each expression is of a certain type and will be interpreted into a certain semantic
domain corresponding to that type.
-a functional expression of a certain type can only apply to expressions which denote
entities in the domain of the function which is the interpretation of that functional
expression, and hence are of the type of the input for that function.
-The result of applying a functional expression to an expression of the correct input
type is an expression which is of the output type, and hence is interpreted as an entity
within the range of the interpretation of that functional expression.
We make this precise by defining the language of functional type logic TL.
We first define all the possible types that we will have expressions of in the language:
Let e and t be two symbols.
TYPETL is the smallest set such that:
1. e,t  TYPETL
2. if a,b  TYPETL then <a,b>  TYPETL
e and t are our basic types. They are the only types in the theory that will not be types
of functions.
-e (entity) is the type of expressions that denote individuals.
Thus e is the type of terms in predicate logic.
-t (truth value) is the type of expressions that denote truth values.
Thus t is the type of formulas in predicate logic.
-For any types a and b, <a,b> is the type of expressions that denote functions from
entities denoted by expressions of type a into entities denoted by expressions of type
b.
In short, <a,b> is the type of expressions that denote functions mapping a-entities onto
b-entities.
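The recursive definition of TYPETL can be mirrored directly in a few lines of code (a sketch; the string encoding of types is my own choice, not the text's):

# Types are encoded as strings: 'e', 't', or '<a,b>' for types a and b.
def pair(a, b):
    """Form the functional type <a,b> from types a and b."""
    return f"<{a},{b}>"

ET  = pair("e", "t")      # '<e,t>'    : one-place predicates
NP  = pair(ET, "t")       # '<<e,t>,t>': noun phrase interpretations
DET = pair(ET, NP)        # '<<e,t>,<<e,t>,t>>': determiners
print(DET)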
This definition of types gives us a rich set of types of expressions:
Since e and t are types, <e,t> is a type: the type of functional expressions denoting
functions that map individuals (the denotations of expressions of type e) onto truth
values (the denotations of expressions of type t, formulas). As we will see, this is the
type of properties (or sets) of individuals.
Assuming <e,t> to be the type of property-denoting expressions, then <<e,t>,<e,t>> is
the type of expressions mapping properties onto properties: the corresponding
functions take a function from individuals to truth values as input and deliver
similarly a function from individuals into truth values as output. As we will see, this
is just the type for expressions like prenominal adjectives.
Since <e,t> is a type and t is a type, <<e,t>,t> is a type as well. This is the type of
expressions that denote functions that themselves take a function from individuals
into truth values as input and deliver a truth value as output. As we will see, this is
the right type for interpreting noun phrases like every boy.
Since <e,t> is a type and <<e,t>,t> is a type, also <<e,t>,<<e,t>,t>> is a type: the
type of expressions that denote functions which take functions from individuals into
truth values as input, and give noun phrase interpretations as output. This is the type
of determiners.
We really have a rich theory of types. Many types are included in the theory that we
may not have any use for, like the type <<t,e>,<t,e>>: the type of expressions that
denote functions which map functions from truth values into individuals onto
functions from truth values into individuals. It is unlikely that we can argue plausibly
that there are natural language expressions that will be interpreted as functions of this
type. To have a general and simple definition of the types and of our language, we will
nevertheless include these in our logical language (they just won't be used in giving
interpretations to natural language expressions).
We will very soon give some ways of thinking about the types described above in a
more intuitive or simple way. The above examples were only meant to show how the
definition of types works.
Based on this definition, we now define the logical language TL. The language TL
has the following logical constants: ¬, ∧, ∨, →, ∀, ∃, =, (, ), λ
As we can see, for ease we take over all the logical constants of predicate logic and
we add another symbol, λ, the λ-operator.
I will from now on in the definitions leave out subscript TL, indicating that a set is
particular to the language. I take that to be understood.
Constants and variables:
For each type a  TYPE, we specify a set CONa of non-logical constants of type a
and a set VARa of variables of type a:
As before, we can think about the non-logical constants of type a as the lexical items
whose semantics we do not further specify.
For every a  TYPE: CONa = {ca1,ca2,...}
a set of constants of type a (at most countably many)
For every a e TYPE: VARa = {xa1,xa2,...}
a set of variables of type a (countably many)
We will never write these type-indices explicitly in the formulas, rather we will make
typographical conventions about the types of our expressions.
For instance, I will typically use the following conventions:
c,j,m are constants of type e
x,y,z are variables of type e
φ,ψ are expressions of type t
P,Q are variables of type <e,t>
But generally I will indicate the typographical conventions where needed.
So we have as many constants as we want for each type (that is up to our choice), and
we have countably many variables of each type.
With this given we will specify the syntax of the language. We do that by specifying
for each type a  TYPE the set EXPa, the set of well formed expressions of type a.
For each a  TYPE, EXPa is the smallest set such that:
1. Constants and variables:
CONa  VARa  EXPa
Constants and variables of type a are expressions of type a.
2. Functional application:
If α  EXP<a,b> and β  EXPa then (α(β))  EXPb
3. Functional abstraction:
If x  VARa and β  EXPb then λxβ  EXP<a,b>
4. Connectives:
If φ,ψ  EXPt then ¬φ, (φ  ψ), (φ  ψ), (φ  ψ)  EXPt
5. Quantifiers:
If x  VARa and φ  EXPt then xφ, xφ  EXPt
6. Identity:
If α  EXPa and β  EXPa then (α=β)  EXPt
As said before, we will ignore clause 3 of this definition for the moment. Focusing
on clauses 4-6, we see that type logic contains predicate logic as a part. Clause 4,
concerning the connectives, is just the same clause that we had in predicate logic
(because FORM = EXPt).
Clause 5 for quantifiers looks just the same as the corresponding clause in predicate
logic, but note that in predicate logic we could only quantify over individual variables
(that is, variables of type e), while in type logic, we allow quantification over variables
of any type: in type logic we have higher-order quantification.
Similarly, while in predicate logic we can form identity statements between terms
(where TERM = EXPe), in type logic we can form identity statements between any
two expressions of any type, as long as they have the same type. The resulting
identity statement is a formula, an expression of type t.
As an aside: in predicate logic, we only need the logical constants ¬ and ∧, predication
(P(t1,...,tn)), one quantifier ∃, and identity, and we can define all the other
logical constants. In type logic we only need application (α(β)), abstraction λ, and
identity =, and we can define in type logic all the connectives and quantifiers we
want. This means that clauses 4 and 5 are really only there for convenience.
Let us now focus on clause 2, the clause for functional application.
To see how this works let us specify some constants. Let us assume that we have the
following constants:
MARY  CONe
e is the type of individuals.
MAN, BOY  CON<e,t>
<e,t> is the type of one-place predicates of individuals.
KISSED, HUGGED  CON<e,<e,t>>
<e,<e,t>> is the type of two-place relations between individuals.
SOME, EVERY  CON<<e,t>,<<e,t>,t>>
<<e,t>,<<e,t>,t>> is the type of determiners.
OLD  CON<<e,t>,<e,t>>
<<e,t>,<e,t>> is the type of one-place predicate modifiers: these are functions that
take a property and yield a property.
AND1  CON<<<e,t>,t>,<<<e,t>,t>,<<e,t>,t>>>
<<e,t>,t> is the type of noun phrase meanings.
AND1 takes a noun phrase meaning, and another noun phrase meaning and produces a
noun phrase meaning.
AND2  CON<<e,<e,t>>,< <e,<e,t>>,<e,<e,t>>>>
Here AND is conjunction of two-place relations, it takes two two place relations, and
produces a two place relation.
Let us now do some functional applications:
HUGGED ∈ CON<e,<e,t>>, a two-place predicate, hence (by clause 1):
HUGGED ∈ EXP<e,<e,t>>
Similarly AND2 ∈ EXP<<e,<e,t>>,<<e,<e,t>>,<e,<e,t>>>>
We see that AND2 is of type <a,b> with a = <e,<e,t>> and b = <<e,<e,t>>,<e,<e,t>>>,
and HUGGED ∈ EXP<e,<e,t>> is of type a.
Hence (AND2(HUGGED)) ∈ EXP<<e,<e,t>>,<e,<e,t>>>, an expression of type b.
Next: (AND2(HUGGED)) is of type <a,b> with a = <e,<e,t>> and b = <e,<e,t>>,
and KISSED ∈ EXP<e,<e,t>> is of type a.
Hence ((AND2(HUGGED))(KISSED)) ∈ EXP<e,<e,t>>, an expression of type b.
We see that ((AND2(HUGGED))(KISSED)) is itself a two-place predicate. This
makes it very suitable to serve as a representation for kissed and hugged (of course,
we still would need to give AND2 the correct interpretation).
((AND2(HUGGED))(KISSED)) ∈ EXP<e,<e,t>> is of type <a,b> with a = e and b = <e,t>,
and MARY ∈ EXPe is of type a.
Hence (((AND2(HUGGED))(KISSED))(MARY)) ∈ EXP<e,t>, an expression of type b.
(((AND2(HUGGED))(KISSED))(MARY)) is a one-place predicate; again, that is
what we would expect kissed and hugged Mary to be.
OLD  EXP<<e,t>,<e,t>>
MAN  EXP<e,t>
Hence (OLD(MAN))  EXP<e,t>
SOME  EXP<<e,t>,<<e,t>,t>>
(OLD(MAN))  EXP<e,t>
Hence (SOME((OLD(MAN))))  EXP<<e,t>,t>
EVERY  EXP<<e,t>,<<e,t>,t>>>
BOY  EXP<e,t>
Hence (EVERY(BOY))  EXP<<e,t>,t>
AND1  EXP<<<e,t>,t>,<<<e,t>,t>,<<e,t>,t>>>
(EVERY(BOY))  EXP<<e,t>,t>
Hence (AND1((EVERY(BOY))))  EXP<<<e,t>,t>,<<e,t>,t>>
(AND1((EVERY(BOY))))  EXP<<<e,t>,t>,<<e,t>,t>>
(SOME((OLD(MAN))))  EXP<<e,t>,t>
Hence ((AND1((EVERY(BOY))))((SOME((OLD(MAN))))))  EXP<<e,t>,t>
Assuming that <<e,t>,t> is the correct type for noun phrase interpretations, we have
see that all of (SOME((OLD(MAN)))) (some old man), (EVERY(BOY)) (every boy)
and
((AND1((EVERY(BOY))))((SOME((OLD(MAN)))))) (some old man and every boy)
are of that type.
Finally:
((AND1((EVERY(BOY))))((SOME((OLD(MAN))))))  EXP<<e,t>,t>
(((AND2(HUGGED))(KISSED))(MARY))  EXP<e,t>
Hence
(((AND1((EVERY(BOY))))((SOME((OLD(MAN))))))
((((AND2(HUGGED))(KISSED))(MARY))))  EXPt
We can make this more perspicuous by introducing a notation convention:
Notation convention: Infix notation:
Let R be an expression of type <a,<a,c>>, and α and β expressions of type a. Then
((R(α))(β)) is an expression of type c.
We define: (β R α) := ((R(α))(β))
:= means 'is by definition'. Thus (β R α) is notation that we add to type theory: it is
not defined as part of the syntax of the language, but is just a way in which we make
our expressions more readable.
With infix notation, we can write the above expression with infix notation applied to
both AND1 and to AND2 as:
(((SOME((OLD(MAN)))) AND1 (EVERY(BOY))) ((KISSED AND2 HUGGED)(MARY)))
This is a well-formed expression of type logic of type t, a sentence, and it can be used
as the representation of sentence (1). Of course, we haven't yet interpreted the
expressions occurring in this expression, but we can note that, unlike the predicate
logical representation (1'), in the above expression we can find a well-formed sub-expression
corresponding to each of the parts of sentence (1).
(1) Some old man and every boy kissed and hugged Mary.
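The hand derivations in this section can also be reproduced mechanically. Here is a minimal sketch (Python; the encoding of types as nested pairs and the function names are my own) that assigns the types listed above to the constants and computes the type of any functional application, rejecting type mismatches:

# Types encoded as nested pairs: 'e', 't', or (a, b) for <a,b>.
E, T = "e", "t"
ET = (E, T)
TV = (E, ET)                      # <e,<e,t>>: transitive verbs
NP = (ET, T)                      # <<e,t>,t>: noun phrases
DET = (ET, NP)                    # <<e,t>,<<e,t>,t>>: determiners

TYPE = {
    "MARY": E, "MAN": ET, "BOY": ET,
    "KISSED": TV, "HUGGED": TV,
    "SOME": DET, "EVERY": DET,
    "OLD": (ET, ET),
    "AND1": (NP, (NP, NP)),
    "AND2": (TV, (TV, TV)),
}

def apply_type(fun, arg):
    """Type of (fun(arg)): defined only if fun is of a type <a,b> and arg is of type a."""
    if not isinstance(fun, tuple):
        raise TypeError(f"{fun} is not a functional type")
    a, b = fun
    if arg != a:
        raise TypeError(f"cannot apply an expression of type {fun} to one of type {arg}")
    return b

# ((AND2(HUGGED))(KISSED))(MARY) : <e,t>
vp = apply_type(apply_type(apply_type(TYPE["AND2"], TYPE["HUGGED"]),
                           TYPE["KISSED"]), TYPE["MARY"])
# (AND1(EVERY(BOY)))(SOME(OLD(MAN))) : <<e,t>,t>
np = apply_type(apply_type(TYPE["AND1"], apply_type(TYPE["EVERY"], TYPE["BOY"])),
                apply_type(TYPE["SOME"], apply_type(TYPE["OLD"], TYPE["MAN"])))
print(apply_type(np, vp))   # 't': the whole sentence is of type t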
2.3. The semantics of functional type logic.
Let us now move to the semantics of functional type logic.
As before, we start with specifying the models for our type logical language TL. I
will follow the practice here to include in the definition of the models only those
aspects that vary from model to model.
A model for functional type logic is a pair:
M = <D,F>, where:
1. D, the domain of individuals, is a non-empty set.
2. F is the interpretation function for the non-logical constants.
This looks just like predicate logic. However, since we possibly have many more
types of constants, it is in the specification of the interpretation function F that
differences will come in.
In the model we only specify the domain D, the domain of individuals. This is obviously
not the only domain in which expressions are interpreted. We don't put the other
domains in the definition of the model, because, once D is specified, all other
domains are predictable and are given in a domain definition:
Let M = <D,F> be a model for TL. For any type a ∈ TYPETL, we define Da, the
domain of type a:
1. De = D
2. Dt = {0,1}
3. D<a,b> = (Da → Db)
So, the domain of type e, De is the domain of individuals given in the model M.
The domain of type t, Dt, is the set of truth values {0,1}.
For any sets A,B: (A → B) = {f: f is a function from A into B}
Thus (A → B) is the set of all functions with domain A and range included in B.
Hence D<a,b> is the set of all functions with domain Da - the set of entities in the
domain of a - and with their range included in Db - the set of entities in the domain of
b. In other words, D<a,b> is the set of all functions from a-entities into b-entities.
Thus, D<e,t> = (De → Dt) = (D → {0,1})
The set of all functions from individuals into truth values.
D<e,<e,t>> = (De → (De → Dt)) = (D → (D → {0,1}))
The set of all functions that map each individual into a function from individuals into
truth values.
D<<e,t>,t> = ((De → Dt) → Dt) = ((D → {0,1}) → {0,1})
The set of all functions that map functions from individuals into truth values onto
truth values.
D<<e,t>,<<e,t>,t>> = ((De → Dt) → ((De → Dt) → Dt)) =
((D → {0,1}) → ((D → {0,1}) → {0,1}))
The set of all functions that map functions from individuals into truth values onto
functions that map functions from individuals into truth values onto truth values.
D<<e,t>,<e,t>> = ((De → Dt) → (De → Dt)) =
((D → {0,1}) → (D → {0,1}))
The set of all functions that map each function from individuals into truth values onto
a function from individuals into truth values.
D<t,t> = (Dt → Dt) = ({0,1} → {0,1})
The set of all functions from truth values into truth values (= the set of all one-place
truth functions).
D<t,<t,t>> = (Dt → (Dt → Dt)) = ({0,1} → ({0,1} → {0,1}))
The set of all functions from truth values into one-place truth functions (=, as we will
see, the set of all two-place truth functions).
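For a finite domain of individuals these domains can be computed; the following small sketch (Python, with my own encoding of types as nested pairs) just computes their sizes, which already shows how quickly they grow:

def dom_size(ty, n):
    """Size of D_a for a finite domain of individuals of size n.
    Types are 'e', 't', or a pair (a, b) for <a,b>."""
    if ty == "e":
        return n
    if ty == "t":
        return 2
    a, b = ty
    return dom_size(b, n) ** dom_size(a, n)   # |D_<a,b>| = |D_b| ** |D_a|

# With two individuals: |D_<e,t>| = 4, |D_<<e,t>,t>| = 16, |D_<<e,t>,<<e,t>,t>>| = 65536.
print(dom_size(("e", "t"), 2), dom_size((("e", "t"), "t"), 2),
      dom_size((("e", "t"), (("e", "t"), "t")), 2))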
The domain definition specifies a semantic domain for every possible type in TYPE.
With that, we can specify the interpretation function F for the non-logical constants of
the model:
A model for functional type logic TL is a pair M = <D,F>,
where D is a non-empty set and:
for every a  TYPETL: F:CONa  Da
For every type a, F maps any constant c  CONa onto an entity F(c)  Da. So F
interprets a non-logical constant of type a as an entity in the domain of a.
Let M = <D,F> be a model for TL.
An assignment function (on M) is any function g such that:
for every a  TYPETL for every x  VARa: g(x)  Da
Thus an assignment function maps every variable of every type into an entity of the
domain corresponding to that type.
Hence, g maps variable x ∈ VARe onto an individual g(x) in D, it maps predicate
variable P ∈ VAR<e,t> onto some function g(P) in (D → {0,1}), etc.
We can now give the semantics for our functional type logical language, by
specifying, as before, ⟦α⟧M,g, the interpretation of expression α in model M, relative to
assignment function g:
The semantic interpretation: ⟦α⟧M,g
For any a,b ∈ TYPETL:
1. Constants and variables.
If c ∈ CONa, then ⟦c⟧M,g = F(c)
If x ∈ VARa, then ⟦x⟧M,g = g(x)
2. Functional application.
If α ∈ EXP<a,b> and β ∈ EXPa then:
⟦(α(β))⟧M,g = ⟦α⟧M,g(⟦β⟧M,g)
3. Functional abstraction.
If x ∈ VARa and β ∈ EXPb then:
⟦λxβ⟧M,g = h,
where h is the unique function in (Da → Db) such that:
for every d ∈ Da: h(d) = ⟦β⟧M,g[x/d]
(here g[x/d] is the assignment that is exactly like g, except that it assigns d to x)
4. Connectives.
If φ,ψ ∈ EXPt, then:
⟦¬φ⟧M,g = ¬(⟦φ⟧M,g)
⟦(φ ∧ ψ)⟧M,g = ∧(<⟦φ⟧M,g,⟦ψ⟧M,g>)
⟦(φ ∨ ψ)⟧M,g = ∨(<⟦φ⟧M,g,⟦ψ⟧M,g>)
⟦(φ → ψ)⟧M,g = →(<⟦φ⟧M,g,⟦ψ⟧M,g>)
Here on the right hand side we find the following truth functions:
¬: {0,1} → {0,1}; ∧, ∨, → are functions from {0,1} × {0,1} into {0,1}, given by:
¬ = {<0,1>,<1,0>}
∧ = {<<1,1>,1>,<<1,0>,0>,<<0,1>,0>,<<0,0>,0>}
∨ = {<<1,1>,1>,<<1,0>,1>,<<0,1>,1>,<<0,0>,0>}
→ = {<<1,1>,1>,<<1,0>,0>,<<0,1>,1>,<<0,0>,1>}
5. Quantifiers.
If x ∈ VARa and φ ∈ EXPt then:
⟦∀xφ⟧M,g = 1 iff for every d ∈ Da: ⟦φ⟧M,g[x/d] = 1; 0 otherwise
⟦∃xφ⟧M,g = 1 iff for some d ∈ Da: ⟦φ⟧M,g[x/d] = 1; 0 otherwise
6. Identity.
If α, β ∈ EXPa then:
⟦(α=β)⟧M,g = 1 iff ⟦α⟧M,g = ⟦β⟧M,g
A sentence is a formula without free variables.
Let X be a set of sentences and φ a sentence.
⟦φ⟧M = 1, φ is true in M, iff for every g: ⟦φ⟧M,g = 1
⟦φ⟧M = 0, φ is false in M, iff for every g: ⟦φ⟧M,g = 0
X ╞ φ, X entails φ, iff for every model M:
if for every δ ∈ X: ⟦δ⟧M = 1 then ⟦φ⟧M = 1
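To see clauses 1 and 2 at work, here is a toy model sketch (Python; the particular model, and the encoding of one-place predicates as characteristic functions, are illustrative assumptions, not the text's):

D = {"john", "bill", "mary"}                       # the domain of individuals

# F interprets constants; one-place predicates are characteristic functions in (D -> {0,1}).
F = {
    "BOY":  lambda d: 1 if d in {"john", "bill"} else 0,
    "WALK": lambda d: 1 if d in {"john", "bill", "mary"} else 0,
    # EVERY: a function in (D_<e,t> -> (D_<e,t> -> {0,1})).
    "EVERY": lambda q: lambda p: 1 if all(p(d) == 1 for d in D if q(d) == 1) else 0,
}

def interpret(expr):
    """Clause 1 (constants) and clause 2 (functional application), nothing more."""
    if isinstance(expr, str):
        return F[expr]
    fun, arg = expr
    return interpret(fun)(interpret(arg))

# The interpretation of (EVERY(BOY))(WALK) in this model: 1
print(interpret((("EVERY", "BOY"), "WALK")))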
2.4. Boolean algebras.
Take the set {a,b,c}. Its powerset pow({a,b,c}) is
{ Ø, {a}, {b}, {c}, {a,b}, {a,c}, {b,c}, {a,b,c} }
The following picture shows how these sets are related to each other by the subset
relation ⊆:
[Hasse diagram of pow({a,b,c}): {a,b,c} at the top; {a,b}, {a,c}, {b,c} in the middle;
{a}, {b}, {c} below them; Ø at the bottom]
The line going up from {a} to {a,b} in the picture means that {a} is a subset of {a,b}.
The lines are understood as transitive: there is also a line going up from {a} to
{a,b,c}, and, though we don't draw it, we take it to be understood that the order is
reflexive: every element has an (invisible) line going to itself.
It can easily be checked that the set pow({a,b,c}) is closed under the standard set-theoretic
operations of complementation −, union ∪, and intersection ∩,
meaning: for every X,Y ∈ pow({a,b,c}): −X, X∪Y, X∩Y ∈ pow({a,b,c}).
The structure <pow({a,b,c}), ⊆, −, ∩, ∪, Ø, {a,b,c}> is a Boolean algebra.
A powerset Boolean algebra is a structure <pow(A), ⊆, −, ∩, ∪, Ø, A> where:
A is a set, ⊆ is the subset relation on pow(A), and −, ∩, ∪ are the standard set-theoretic
operations on pow(A).
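A quick sketch (Python; frozensets stand in for the subsets, an encoding of my own) that builds pow({a,b,c}) and checks the closure claim just made:

from itertools import combinations

A = frozenset({"a", "b", "c"})
POW = {frozenset(s) for r in range(len(A) + 1) for s in combinations(A, r)}

# Closure under complementation, union and intersection.
for X in POW:
    assert A - X in POW
    for Y in POW:
        assert X | Y in POW and X & Y in POW

print(len(POW))   # 8 elements, ordered by the subset relation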
In general, Boolean algebras are structures that model what could be called
naturalistic typed extensional part-of structures.
Think of me and my parts. Different things can be part of me in different senses. For
instance, I can see myself as the whole of my body parts, but also as the whole of my
achievements, and in many other ways.
-Typing restricts the notion of part-of to one of those senses. Thus, the domain in
which I am the whole of my body parts will not include my achievements as parts of
me, even though, in another sense, these are part of me.
-Extensionality means that in the domain chosen I am not more and not less than the
whole of my parts. This means that if I choose the domain in which I am the whole of
my body parts, then I cannot express in that domain that really and truly I am more
than the sum of my parts. If I want to express the latter, I must do that by relating
the whole of my body parts in the body part domain to me as an entity in a different
domain in which I cannot be divided into body parts. This is how intensionality
comes in: I can be at the same time indivisible me and the sum of my body parts,
because I occur in different domains (or, alternatively, there is a cross-domain identity
relation which identifies objects in different domains as me).
-Naturalistic means that each typed extensional part-of notion is naturalistic: in that
structure parts work the way parts naturalistically work.
The claim is, then, that Boolean algebras formalize this notion.
I define a successive series of notions:
A partial order is a pair B = <B,⊑>, where B is a non-empty set and ⊑, the part-of
relation, is a reflexive, transitive, antisymmetric relation on B.
Reflexive: for every b ∈ B: b ⊑ b
Transitive: for every a,b,c ∈ B: if a ⊑ b and b ⊑ c then a ⊑ c
Antisymmetric: for every a,b ∈ B: if a ⊑ b and b ⊑ a then a = b
In a partial order we define the notions of join, ⊔, and meet, ⊓. In a structure of parts,
we think of the join as the sum of the parts, and of the meet as their overlap.
Let B = <B,⊑> be a partial order and a,b ∈ B:
The join of a and b, (a ⊔ b), is the unique element of B such that:
1. a ⊑ (a ⊔ b) and b ⊑ (a ⊔ b).
2. For every c ∈ B: if a ⊑ c and b ⊑ c then (a ⊔ b) ⊑ c.
The meet of a and b, (a ⊓ b), is the unique element of B such that:
1. (a ⊓ b) ⊑ a and (a ⊓ b) ⊑ b.
2. For every c ∈ B: if c ⊑ a and c ⊑ b then c ⊑ (a ⊓ b)
The idea is:
-To find the join of a and b, you look up in the part-of structure at the set of all
elements of which both a and b are part: {c ∈ B: a ⊑ c and b ⊑ c}. The join of a and
b is the minimal element in that set.
-To find the meet of a and b, you look down in the part-of structure at the set of all
elements which are part both of a and of b: {c ∈ B: c ⊑ a and c ⊑ b}. The meet of a
and b is the maximal element in that set.
Elements need not have a join or a meet in a partial order. A partial order in which
any two elements do have both a join and a meet is called a lattice:
A lattice is a partial order B = <B,⊑> such that:
for every a,b ∈ B: (a ⊔ b), (a ⊓ b) ∈ B.
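The 'look up / look down' recipe can be turned directly into a small procedure (a sketch in Python; the poset is handed in as an explicit part-of test, an encoding of my own): the join of a and b is the minimum of their common upper bounds, if there is one, and the meet is the maximum of their common lower bounds.

def join(B, leq, a, b):
    """Least upper bound of a and b in the poset (B, leq), or None if there is none."""
    uppers = [c for c in B if leq(a, c) and leq(b, c)]
    for c in uppers:
        if all(leq(c, d) for d in uppers):
            return c
    return None

def meet(B, leq, a, b):
    """Greatest lower bound of a and b, or None if there is none."""
    lowers = [c for c in B if leq(c, a) and leq(c, b)]
    for c in lowers:
        if all(leq(d, c) for d in lowers):
            return c
    return None

# The powerset of {1,2} ordered by inclusion: join is union, meet is intersection.
B = [frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2})]
print(join(B, frozenset.issubset, frozenset({1}), frozenset({2})))   # frozenset({1, 2})
print(meet(B, frozenset.issubset, frozenset({1}), frozenset({2})))   # frozenset()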
Not lattices:
[Diagrams of three partial orders I, II and III that are not lattices]
In I, a and b do not have a join. In II, a and b do not have a meet.
In III, the set {c ∈ B: a ⊑ c and b ⊑ c} = {c,d,1}. This set doesn't have a minimum,
hence a and b don't have a join. Similarly, c and d don't have a meet.
Lattices:
[Diagrams of four lattices IV, V, VI and VII]
A partial order B = <B,⊑> is bounded iff B has a maximum 1 and a minimum 0 in
B.
1 is a maximum in B iff 1 ∈ B and for every b ∈ B: b ⊑ 1
0 is a minimum in B iff 0 ∈ B and for every b ∈ B: 0 ⊑ b
Obviously, then, a bounded lattice is a lattice which is bounded.
Fact: Every finite lattice is bounded.
Proof sketch: If B = {b1,...,bn} then 1 = (b1 ⊔ (b2 ⊔ (... ⊔ (bn-1 ⊔ bn)...))) and
0 = (b1 ⊓ (b2 ⊓ (... ⊓ (bn-1 ⊓ bn)...)))
Not every lattice is bounded, though. If we take Z, the integers, ..., -1, 0, 1, ..., with the normal partial
order ≤ and we define: (n ⊓ m) = the smallest one of n and m
(n ⊔ m) = the biggest one of n and m
then we get a lattice, but not one that has a minimum or a maximum.
0 is part of everything. Conceptually this means that 0 doesn't count as a naturalistic
part. Nevertheless, it is very convenient to admit 0 to the structure, because it
simplifies proofs of structural insights immensely. But indeed we do not count 0 as a
naturalistic part, because we define:
a and b have no part in common iff (a ⊓ b) = 0
We assume as minimal constraints on naturalistic extensional typed part-of structures
that they be lattices with a minimum 0. And this means that we assume that in a part-of
structure we can always carve out the sum of two parts; of overlapping parts, we
can carve out the part in which they overlap; and for non-overlapping parts we
conveniently set their overlap to 0.
Now we come to the naturalistic constraints. They concern what happens to the parts
of an object a when we cut it. When I say cut it, I mean cut it. In a naturalistic part-of
structure, cutting should be real cutting, the way my daughter does it. It turns out
that real cutting imposes requirements on part-of structures that not all part-of
structures allowed so far can live up to. If in those structures cutting cannot mean real
cutting, they cannot be regarded as naturalistic structures.
Naturalistic constraint on cutting 1:
Suppose we have an object d and a part c of d:
[Diagram: object d with part c inside it]
Now suppose we cut through d:
[Diagram: d cut into two halves a and b]
This means that d = (a ⊔ b) and (a ⊓ b) = 0.
If we cut in this way, where is c going to sit?
And the intuitive answer is obvious: either c is part of a, and c goes to the a-side, or
c is part of b, and c goes to the b-side, or c overlaps both a and b and hence c itself
gets cut into a part a1 of a and a part b1 of b, and this means that c = (a1 ⊔ b1) and
(a1 ⊓ b1) = 0.
Structures that don't satisfy this naturalistic cutting intuition miss something crucial
about what parts are and how you cut. We define:
Lattice B = <B,⊑> is distributive iff for every a,b,c ∈ B:
If c ⊑ (a ⊔ b) then either c ⊑ a or c ⊑ b
or for some a1 ⊑ a and some b1 ⊑ b: c = (a1 ⊔ b1)
Lattices IV and V are not distributive. IV is called the diamond, V is called the
pentagon.
The diamond is not distributive:
[Diagram of the diamond: 0 at the bottom; a, b, c incomparable in the middle; 1 at the top]
1 = (a ⊔ b), but c is not part of a, nor of b, nor do a and b have parts a1 and b1
respectively, such that c is the sum of a1 and b1.
The pentagon is not distributive:
[Diagram of the pentagon: 0 at the bottom, 1 at the top, the chain b ⊑ c on one side and a on the other]
1 = (a ⊔ b), but c is not part of a, nor of b, and not the sum of any part of a and any
part of b.
There is a very insightful theorem about distributivity, which says that in essence, the
pentagon and the diamond are the only possible sources of non-distributivity:
Theorem: A lattice is distributive iff you cannot find either the pentagon or the
diamond as a substructure.
Standardly, distributivity is defined by one of the following two alternative
definitions:
(1) Lattice B = <B,⊑> is distributive iff
for all a,b,c ∈ B: (a ⊓ (b ⊔ c)) = ((a ⊓ b) ⊔ (a ⊓ c))
(2) Lattice B = <B,⊑> is distributive iff
for all a,b,c ∈ B: (a ⊔ (b ⊓ c)) = ((a ⊔ b) ⊓ (a ⊔ c))
I gave the above definition because its conceptual naturalness is clearer. But it is not
hard to prove that either definition (1) or (2) is equivalent to the one I gave.
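Definitions (1) and (2) are easy to test mechanically on small lattices. A sketch (Python; the diamond is encoded by listing its part-of pairs, following the labelling used above):

def leq_from_pairs(pairs):
    return lambda x, y: (x, y) in pairs or x == y

def lub(B, leq, a, b):
    uppers = [c for c in B if leq(a, c) and leq(b, c)]
    return next(c for c in uppers if all(leq(c, d) for d in uppers))

def glb(B, leq, a, b):
    lowers = [c for c in B if leq(c, a) and leq(c, b)]
    return next(c for c in lowers if all(leq(d, c) for d in lowers))

def distributive(B, leq):
    # Definition (1): a meet (b join c) = (a meet b) join (a meet c), for all triples.
    return all(glb(B, leq, a, lub(B, leq, b, c)) ==
               lub(B, leq, glb(B, leq, a, b), glb(B, leq, a, c))
               for a in B for b in B for c in B)

# The diamond: 0 below a, b, c, which are incomparable and all below 1.
DIAMOND = ["0", "a", "b", "c", "1"]
leq_d = leq_from_pairs({("0", x) for x in DIAMOND} | {(x, "1") for x in DIAMOND})
print(distributive(DIAMOND, leq_d))   # False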
We come to the second constraint on cutting:
Naturalistic constraint on parts and cutting 2:
We go back to d and c:
[Diagram: object d with part c inside it]
The second naturalistic intuition is very simple. It is that c can indeed be cut out of d,
and that cutting it out means what it intuitively does: when you cut c out of d, you
remove from the partial order all parts of d that have a part in common with c. This
means that you will be left only with parts of d that didn't have a part in common with
c. The naturalistic constraint is that what you are left with is a part e of d such that
(c ⊔ e) = d and (c ⊓ e) = 0, and its parts.
What this says is that the way my daughter cuts is like this:
[Diagram: d cut exactly into c and e]
The constraint, thus, regulates in cutting the relation between what you cut out and
what is left. What is left, when we cut out c, we call the complement of c.
We define:
Bounded lattice B = <B,⊑> is complemented iff
for every b ∈ B: there is a c ∈ B such that (b ⊔ c) = 1 and (b ⊓ c) = 0.
In non-distributive bounded lattices elements can have more than one complement.
For instance, in the diamond, both a and b are complements of c.
In distributive bounded lattices each element has at most one complement. Thus
each element has 0 or 1 complement. This means that in a complemented distributive
lattice each element has a unique complement:
B = <B,⊑> is a Boolean lattice iff B is a complemented distributive lattice.
There are two ways in which a bounded distributive lattice can fail to be a Boolean
lattice.
1. The structure, so to speak, doesn't have enough elements.
Example: Let A be an infinite set and let FIN(A) = {X ⊆ A: X is finite}, the set of all
finite subsets of A. Let FIN+(A) = FIN(A) ∪ {A}. Then <FIN+(A),⊆> is a bounded
distributive lattice. It is a lattice because the intersection and union of finite sets is
again a finite set, it has minimum Ø and maximum A, and it satisfies distributivity
because ∩ and ∪ are distributive. But the lattice is not complemented: for instance, if
a ∈ A, then {a} ∈ FIN+(A), and the elements of FIN+(A) that don't overlap {a} are
the sets X with a ∉ X. But the only candidate complement for {a}, the set A−{a}, is an
infinite proper subset of A, and hence not in FIN+(A).
2. The structure, so to speak, has too many elements.
This is more serious.
Another example of a bounded distributive lattice which is not complemented is
structure VI:
[Diagram of structure VI]
Look at c. If we cut out all the elements that have a part in common with c, we are
left with e and its parts {0, f, e}, the set of elements that have no part in common with
c. But something got mysteriously lost here, because (c ⊔ e) ≠ d.
We can formulate the problem with this structure in a different way.
Let B = <B,⊑> be a bounded distributive lattice.
B is witnessed iff the proper part relation is witnessed in B.
The proper part relation is witnessed in B = <B,⊑> iff
for every a,b ∈ B: if a ≠ 0 and a ⊑ b and a ≠ b (a is a proper part of b, but not
0) then there is a c ∈ B such that c ≠ 0 and c ⊑ b (and c ≠ b) and (a ⊓ c) = 0.
This means that if a is a proper part of b, and you cut a out naturalistically, with its
parts, there should be something left that didn't overlap with a.
Intuitively this means that you can only be a proper part of b if there exists another
proper part of b to witness your properness.
This is such a natural constraint on cutting that structures that don't satisfy it seem
indeed quite unnatural. Like structure VI: element c is a proper part of element a, but
a has no other proper parts (except the irrelevant 0). In VI, the only complemented
elements are the four corner pieces: 0 and 1 are each other's complements, and d and e
are too. None of the other elements has a complement.
The naturalistic constraints on cutting tell us that we want to restrict naturalistic
structures minimally to witnessed distributive lattices.
This means that we exclude structures like VI. This still leaves in unnaturalistic
structures like FIN+(A).
To clarify the relation between witnessed distributive lattices and Boolean lattices we
generalize the binary notions of join and meet:
Let B = <B,⊑> be a partial order and X ⊆ B.
The join of X in B is the unique element ⊔X of B such that:
1. for every x ∈ X: x ⊑ ⊔X
2. for every c ∈ B: if for every x ∈ X: x ⊑ c, then ⊔X ⊑ c
The meet of X in B is the unique element ⊓X of B such that:
1. for every x ∈ X: ⊓X ⊑ x
2. for every c ∈ B: if for every x ∈ X: c ⊑ x, then c ⊑ ⊓X
These notions generalize binary (and hence finite) join and meet, because we can
define the binary notions (a ⊔ b) as ⊔{a,b} and (a ⊓ b) as ⊓{a,b}.
With the generalized notion of join, we can formulate the following fact:
Fact: Let B = <B,⊑> be a witnessed distributive lattice.
B is a Boolean lattice iff for every c ∈ B: ⊔{b ∈ B: (b ⊓ c) = 0} ∈ B
Thus, in a Boolean lattice, ⊔{b ∈ B: (b ⊓ c) = 0} is the complement of c; we call it
¬c: the unique element of B such that (c ⊔ ¬c) = 1 and (c ⊓ ¬c) = 0.
We see that the notion of complementation can require certain infinite joins (and
certain infinite meets) to exist in a structure. This does not require all infinite joins
and all infinite meets to exist. A structure with the latter requirement we call
complete:
Lattice B = <B,⊑> is complete iff for every X ⊆ B: ⊔X, ⊓X ∈ B.
In fact, you get complete lattices with much weaker requirements:
Theorem: Let B = <B,⊑> be a partial order in which every subset X has a join ⊔X. Then B is a
complete bounded lattice.
Proof:
-Let B be such a partial order. Take any set X ⊆ B. Take {c ∈ B: for every x ∈ X: c ⊑ x}.
This is itself a subset of B, hence it has a join in B: ⊔{c ∈ B: for every x ∈ X: c ⊑ x} ∈ B.
But, of course, that is exactly ⊓X, hence ⊓X ∈ B. So B is a complete lattice.
-B ⊆ B, hence ⊔B ∈ B. Obviously, ⊔B is a maximum in B.
-Ø ⊆ B, hence ⊔Ø ∈ B. With a bit of precision you can show that ⊔Ø is a minimum in B.
Facts: Every finite lattice is complete.
Every powerset lattice is complete.
We have already indicated above why the first fact holds: for finite lattices ⊔X and
⊓X can be defined in terms of the two-place join and meet.
A powerset lattice is complete because its domain is the set of all subsets of some set
A, and for any X ⊆ pow(A), ∪X and ∩X are subsets of A, hence in pow(A).
Not every lattice is complete. FIN+(A), for one thing, is not. Not even every Boolean lattice is complete.
Let A be an infinite set.
FIN(A) is the set of all finite subsets of A.
COFIN(A) is the set of all subsets whose set-theoretic complement in A is finite:
COFIN(A) = {X ⊆ A: A−X is finite}
(So if A = N, the set of natural numbers, N−{1,2,3} is cofinite, but E, the set of even numbers, is not.)
Now look at <FIN(A) ∪ COFIN(A), ⊆>. This is a bounded distributive lattice with minimum Ø and
maximum A, and it is complemented as well:
for every X ∈ FIN(A) ∪ COFIN(A): ¬X = A−X.
Now, if X ⊆ A is finite, then A−X is cofinite, and if X ⊆ A is cofinite, A−X is finite. This means that
FIN(A) ∪ COFIN(A) is a Boolean lattice.
But it is not complete, for the same reason as before: if A = N, then
{{n}: n is even} ⊆ FIN(A) ∪ COFIN(A). But ∪{{n}: n is even} (= E) is neither finite nor cofinite,
so this subset has no join in the lattice.
While the theory of Boolean lattices doesn't make completeness a defining
characteristic of the structures, it seems to me that inasmuch as we want as
naturalistic structures complemented distributive lattices rather than merely witnessed
distributive lattices, we want as naturalistic structures complete Boolean lattices,
rather than just Boolean lattices. It is easy to see that:
Fact: Let B = <B,⊑> be a complete lattice.
B is a Boolean lattice iff B is a witnessed distributive lattice.
This means that if we take our part-of structures to be partial orders closed under
generalized ⊔, we only need to add the two naturalistic constraints of distributivity
and witnessing to get naturalistic part-of structures.
For the purposes of semantics we always assume our structures to be complete
structures.
I introduce one more important notion:
Let B = <B,⊑> be a partial order with minimum 0 and let a ∈ B.
a is an atom in B iff a ≠ 0 and for every b ∈ B:
if b ⊑ a then either b = 0 or b = a
a is an atom in B if there is nothing in between 0 and a.
Let B = <B,⊑> be a partial order with minimum 0.
B is atomic iff for every b ∈ B−{0}: there is an atom a ∈ B such that a ⊑ b.
Thus, in an atomic partial order, every element has an atomic part. De facto this
means that when you divide something into smaller and smaller parts, you always end
up with atoms.
Facts: Every finite lattice is atomic.
Every powerset lattice is atomic.
That every finite lattice is atomic is obvious when you think about it: in order for an
element not to have any atomic parts, you need to find smaller and smaller parts all
the way down. That implies infinity.
In the powerset lattice <pow(A),> the singleton sets are the atoms.
Not every infinite lattice is atomic. In fact, infinite lattices, and for that matter,
infinite Boolean lattices exist that are atomless, that have no atoms at all. Even
stronger, complete Boolean lattices exist that are atomless.
We have defined Boolean lattices. We now define Boolean algebras:
A Boolean algebra is a structure B = <B, ⊑, ¬, ⊓, ⊔, 0, 1> such that:
1. <B,⊑> is a Boolean lattice.
2. ¬: B → B is the operation which maps every b ∈ B onto its complement ¬b.
3. ⊓: (B×B) → B is the operation which maps every a,b ∈ B onto their meet (a ⊓ b).
4. ⊔: (B×B) → B is the operation which maps every a,b ∈ B onto their join (a ⊔ b).
5. 0 ∈ B is the minimum of B and 1 ∈ B is the maximum of B.
Let A = <A, ⊑A, ¬A, ⊓A, ⊔A, 0A, 1A> and B = <B, ⊑B, ¬B, ⊓B, ⊔B, 0B, 1B> be Boolean
algebras.
A and B are isomorphic iff there is an isomorphism from A into B.
An isomorphism from A into B is a function h: A → B such that:
1. h is a bijection, a one-one function from A onto B.
2. h preserves the structure from A onto B:
a. for every a ∈ A: h(¬Aa) = ¬Bh(a)
b. for every a1,a2 ∈ A: h((a1 ⊓A a2)) = (h(a1) ⊓B h(a2))
c. for every a1,a2 ∈ A: h((a1 ⊔A a2)) = (h(a1) ⊔B h(a2))
d. h(1A) = 1B
e. h(0A) = 0B
Fact: If h is an isomorphism from A into B then the inverse function h-1 is an
isomorphism from B into A.
What does the notion of isomorphism tell us? Suppose that there is an isomorphism
h between A and an unknown structure B:
[Diagram: the Boolean algebra A, with elements 0, a, b, c, ¬a, ¬b, ¬c, 1, and an unknown
structure B, marked '?', related by h]
The fact that h is a bijection tells us that B has as many elements as A:
[Diagram: A and B side by side; B is now shown with eight elements, as yet unordered]
The fact that h maps 1 onto 1 and 0 onto 0 tells us that B has a maximum 1 (say j)
and a minimum 0 (say k):
[Diagram: A and B side by side; j and k in B are identified as B's maximum and minimum]
The fact that for each element x ∈ A: h(¬x) = ¬h(x) tells us that also in B every
element has a complement.
[Diagram: A and B side by side; B's six remaining elements form three complement pairs
d/¬d, e/¬e, f/¬f]
Now we start to construct. Map a and b and c onto three of these elements.
-They can't map onto 0 and 1, because we have a bijection.
-Also, we cannot map, say, a onto d and b onto ¬d. The reason is the following:
a ⊔ b = ¬c. If h(a) = d and h(b) = ¬d, then h(a) ⊔ h(b) = d ⊔ ¬d = 1.
But h(a) ⊔ h(b) = h(a ⊔ b) = h(¬c). So, then h(¬c) = 1. But then h is not a bijection.
So if, say, we set h(a) = d, we cannot choose ¬d as the value for either b or c. So we
set, say, h(b) = e. Then we cannot set the value for c to ¬e. So let's say we set h(c) = f.
Since a ⊓ b = 0, we know that h(a) ⊓ h(b) = h(a ⊓ b) = h(0) = 0. Hence d ⊓ e = 0.
The same argument shows that d ⊓ f = 0 and that e ⊓ f = 0. So we get:
[Diagram: A and B side by side; h maps a to d, b to e and c to f, and d, e, f have pairwise meet 0]
Now h(a) = h(a) = d
h(b) = h(b) = e
h(c) = h(c) = f
Since a t a = 1, we get:
h(a) t h(b) = h(a t b) = h(1) = 1.
Hence d t e = 1:
The same argument shows that d t f = 1 and e t f = 1
So we get:
[Diagram: A and B side by side; the complements ¬d, ¬e, ¬f are now placed above d, e, f in B]
Now a t b = c. Hence h(a tb) = h(c). Hence h(a) t h(b) = h(c).
Hence d t e = f. We determine in the same way that d t f = e and e t f = d and
we get:
[Diagram: the completed structure B, isomorphic to A via h]
We see that isomorphic Boolean algebras have the same Boolean structure. They at
most differ in the naming of the elements, but not in the Boolean relations between
the elements. The set theory that the theory of Boolean algebras is formulated in is
extensional. This means that A and B are not literally the same Boolean algebra.
But mathematicians (and we) think of Boolean algebras as more intensional entities.
With the mathematicians we will regard A and B as the same Boolean algebra.
In general, we do not distinguish between isomorphic structures: if A happens to be
an algebra of sets (like a powerset) and B an algebra of, say, functions, they are the
same structure if they are isomorphic. And that means that we are free to think of
them as either sets or functions, whichever is more convenient. As we will see, this
double nature is one of the basic tenets of the functional type theory we have
introduced.
Secondly, I have related A and B through isomorphism h such that h(a)=d and h(b)=e
and h(c)=f. But h is not the only isomorphism between A and B. If I set, as for h,
h'(a)=d, but I set h'(b)=f and h'(c)=e I get another isomorphism between A and B
(by working out the other values):
[Diagram: A and B again, now related by the isomorphism h', with e and f exchanged in the picture of B]
I use the pictures to indicate here which element from A maps onto which element of
B. While this B-picture differs from the previous B-picture (e in the middle versus f
in the middle), the difference lies only in the pictures, not in structure: in pictures of
lattices only the vertical ordering is real (since it corresponds to the order ⊑); the left-right
order is only artistic license. If I check which elements are part of which
elements in both B-pictures, I get twice the same literal structure, B.
The fact that there is usually more than one isomorphism between two structures
means that we can let convenience determine which isomorphism we use when we
think about these structures as the same thing. This is a second important tenet of the
functional type theory, as we will see.
I will now discuss some theorems and facts that will give us a lot of insight in the
structure of Boolean algebras.
Representation Theorem:
Every complete atomic Boolean algebra is isomorphic to a powerset Boolean
algebra.
Proof sketch:
Let B be a complete atomic Boolean algebra and let ATOMB be the set of atoms in B.
For every b ∈ B let ATb = {a ∈ ATOMB: a ⊑ b}
So ATb is the set of atoms below b, the set of b's atomic parts.
Define a function h: B → pow(ATOMB) by:
for every b ∈ B: h(b) = ATb.
What you can prove is that h is an isomorphism between B and
<pow(ATOMB), ⊆, −, ∩, ∪, Ø, ATOMB>.
So every complete atomic Boolean algebra is isomorphic to the powerset Boolean algebra of its
set of atoms.
This is not very difficult to prove, if you have first proved the following basic fact about complete
atomic Boolean algebras:
Fact: In a complete atomic Boolean algebra every element is the join of its atomic parts:
for every b ∈ B: b = ⊔ATb.
The representation theorem tells us that the complete atomic Boolean algebras are just
the powerset Boolean algebras. Since we have already seen that every finite lattice is
complete and atomic, and hence so is every finite Boolean algebra, it follows that:
Corollaries: The finite Boolean algebras are exactly the finite powerset Boolean
algebras.
More generally:
For each cardinality n there is (up to isomorphism) exactly one complete
atomic Boolean algebra of cardinality 2^n. There are no complete atomic
Boolean algebras of other cardinalities.
This follows from the Representation Theorem, Cantor's Theorem, which says that if A has cardinality
n, then pow(A) has cardinality 2^n, and an easy theorem which says
that if A and B have the same cardinality, <pow(A),⊆> and <pow(B),⊆> are isomorphic.
I will next discuss a theorem that gives a very deep insight into the structure of Boolean
algebras.
The decomposition theorem:
Every Boolean algebra which has an atom can be decomposed into two non-overlapping
isomorphic Boolean algebras.
Proof sketch:
Let B be a Boolean algebra and a an atom in B. Look at the following sets:
Ba = {b ∈ B: a ⊑ b}, the set of all elements that a is part of.
B¬a = {b ∈ B: b ⊑ ¬a}, the set of all parts of ¬a.
It is not difficult to prove that: Ba ∪ B¬a = B and Ba ∩ B¬a = Ø. This means that Ba and B¬a partition
B.
Now, you can restrict the operations of B to Ba and the result is a Boolean algebra Ba with a as 0.
Similarly, B¬a forms a Boolean algebra B¬a with ¬a as 1.
And you can prove that Ba and B¬a are isomorphic.
Example: pick atom a in A. Look at Ba and Ba:
[Diagram: the lattice A, with the elements that a is part of (Ba) and the parts of ¬a (B¬a) marked.]
You see the two Boolean algebras sitting in the structure (each isomorphic to
pow({a,b})).
It doesn't matter which atom you use here. In fact, there is a further fact:
Fact: If B is a Boolean algebra and a and b are atoms in B then Ba and Bb are
isomorphic.
That means that in a given Boolean algebra, if you stand at any of its atoms and you
look up, the sky looks the same: you see the same structure (up to isomorphism).
So, we also get:
[Diagram: the lattice A twice over, once with the elements above atom b marked and once with the elements above atom c marked; the two marked substructures are isomorphic.]
And it doesn't stop here. If Ba has an atom, then by the same theorem, Ba and
B¬a themselves each decompose into two non-overlapping isomorphic Boolean
algebras, so, all in all, we have decomposed A into four isomorphic Boolean algebras:
[Diagram: the lattice A decomposed into four isomorphic two-element Boolean algebras.]
The inverse of the decomposition theorem is the composition theorem:
First a notion. If R is a relation then [R]TR is the transitive closure of R.
[R]TR = ∩{R': R ⊆ R' and R' is transitive}
So [R]TR is the intersection of all transitive relations that R is part of. This
intersection is, of course, itself a relation that R is part of.
Fact: [R]TR is transitive
Proof:
Assume <x,y> ∈ [R]TR and <y,z> ∈ [R]TR.
Then for every R' ∈ {R': R ⊆ R' and R' is transitive}: <x,y> ∈ R' and <y,z> ∈ R' (by definition of ∩).
Then <x,z> ∈ R', because R' is transitive. But that means that <x,z> is in every R' ∈ {R': R ⊆ R' and
R' is transitive}, and that means it is in their intersection. Hence, <x,z> ∈ [R]TR, and hence [R]TR is
transitive.
Intuitively [R]TR is the minimal way of turning R into a transitive relation:
for every x,y,z such that R contains <x,y> and <y,z>, add to R the pair <x,z>.
This gives you a new relation R1.
Do the same for R1; this gives you a new relation R2. Keep doing this until you end
up with a relation that is transitive. What you end up with is [R]TR, the result of
closing R under transitivity.
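The iterative description above is easy to turn into code. Here is a minimal sketch in Haskell (my own illustration, not part of the text): keep adding the missing pairs until nothing changes.

import Data.List (nub)

type Rel a = [(a, a)]

-- one round of adding <x,z> whenever <x,y> and <y,z> are present
step :: Eq a => Rel a -> Rel a
step r = nub (r ++ [ (x, z) | (x, y) <- r, (y', z) <- r, y == y' ])

-- keep going until a fixed point is reached: that fixed point is [R]TR
transitiveClosure :: Eq a => Rel a -> Rel a
transitiveClosure r
  | step r == r = r
  | otherwise   = transitiveClosure (step r)

-- example: closing {<1,2>,<2,3>,<3,4>} adds <1,3>, <2,4> and <1,4>
main :: IO ()
main = print (transitiveClosure [(1,2),(2,3),(3,4)])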
The Composition Theorem:
Let B1 and B2 be non-overlapping isomorphic Boolean algebras and h an
isomorphism between B1 and B2.
Define ⊑ on B1 ∪ B2 by: ⊑ = [ ⊑B1 ∪ ⊑B2 ∪ {<a,h(a)>: a ∈ B1} ]TR
Then <B1 ∪ B2, ⊑> is a Boolean lattice.
So you take the order on B1, unite it with the order on B2, and add the pairs of the
isomorphism. This puts a directly below h(a) in the new structure. Now you only
need to add transitivity arrows (for every a ∈ B1 and every a1 ∈ B1 and every b1 ∈ B2
such that a1 ⊑B1 a and h(a) ⊑B2 b1, you need to add <a1,b1> to ⊑).
The theorem says that the result will be a Boolean algebra.
It is easy to see that its cardinality will be |B1| + |B2|.
This gives you a second procedure to generate all finite Boolean algebras (the first is
as powersets), but this time one that actually gives you a plotting algorithm.
Start with two one-element Boolean lattices: <{0},{<0,0>}> and <{1},{<1,1>}>
(These are really degenerate Boolean algebras, since 0 = 1. We really start counting at
2.) And the isomorphism h = {<0,1>}. Then the composition theorem tells us how to
form the two-element Boolean algebra:
[Diagram] A 1-element Boolean algebra: a point.
[Diagram] Two 1-element Boolean algebras give a 2-element Boolean algebra: a line.
Take two disjoint two-element Boolean algebras and an isomorphism: turn the arrow
into a line (the transitive closure adds <0,1>), and you get the 4-element Boolean
algebra:
[Diagram] Two 2-element Boolean algebras give a 4-element Boolean algebra: a square.
Take two disjoint four-element Boolean algebras and an isomorphism, and form the
eight-element Boolean algebra:
[Diagram] Two 4-element Boolean algebras give an 8-element Boolean algebra: a cube.
Take two disjoint eight-element Boolean algebras and an isomorphism, and form the
sixteen-element Boolean algebra:
[Diagram] Two 8-element Boolean algebras give a 16-element Boolean algebra: a four-dimensional cube.
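As a sketch of this plotting procedure, here is some Haskell of my own (not from the text), under the simplifying observation that, for these cube-shaped algebras, the transitive closure in the Composition Theorem comes down to the pointwise order on sequences of 0's and 1's: tag one copy with an extra 0, the other with an extra 1, and order coordinatewise.

type Point = [Bool]                 -- an element of the n-th algebra: n coordinates

-- carrier of the 2^n-element Boolean algebra built by n compositions
cube :: Int -> [Point]
cube 0 = [[]]
cube n = [ b : p | b <- [False, True], p <- cube (n - 1) ]

-- the order that the composition construction induces: pointwise
leq :: Point -> Point -> Bool
leq xs ys = and (zipWith (<=) xs ys)

main :: IO ()
main = do
  let n = 3
  putStrLn ("Elements of the " ++ show (2 ^ n) ++ "-element algebra (the cube):")
  mapM_ print (cube n)
  putStrLn "Covering pairs (the edges one would draw in the picture):"
  mapM_ print [ (x, y) | x <- cube n, y <- cube n, x /= y, leq x y
                       , length (filter id y) == length (filter id x) + 1 ]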
The key to the relation between the theory of Boolean algebras and the functional type
theory lies in the following theorem:
The Lifting Theorem:
Let B be a Boolean algebra and A a set.
Define on the function space (A → B) the following relation ⊑:
f ⊑ g iff for every a ∈ A: f(a) ⊑B g(a)
Then:
1. <(A → B), ⊑> is a Boolean lattice.
2. <(A → B), ⊑> is complete iff B is complete.
3. <(A → B), ⊑> is atomic iff B is atomic.
Thus, if B is a Boolean algebra, you can lift the Boolean structure from B onto the set
of all functions from A into B.
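A small Haskell sketch of the lifted structure, for a finite index set A and B = {0,1} (all names are mine; association lists stand in for functions):

type A = String
type B = Bool
type Func = [(A, B)]            -- a function from A into B, as an association list

domainA :: [A]
domainA = ["a1", "a2", "a3"]

val :: Func -> A -> B           -- look up the value of the function at a point
val f a = maybe False id (lookup a f)

-- the lifted order: f below g iff for every a in A, f(a) is below g(a) in B
liftedLeq :: Func -> Func -> Bool
liftedLeq f g = and [ val f a <= val g a | a <- domainA ]

-- join and meet are also computed pointwise
liftedJoin, liftedMeet :: Func -> Func -> Func
liftedJoin f g = [ (a, val f a || val g a) | a <- domainA ]
liftedMeet f g = [ (a, val f a && val g a) | a <- domainA ]

main :: IO ()
main = do
  let f = [("a1", True), ("a2", False), ("a3", True)]
      g = [("a1", True), ("a2", True),  ("a3", True)]
  print (liftedLeq f g)         -- True: f is pointwise below g
  print (liftedJoin f g)        -- the pointwise join of f and g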
Now we observe the following.
BOOL is the smallest set of types such that:
1. t ∈ BOOL
2. if a ∈ TYPE and b ∈ BOOL then <a,b> ∈ BOOL
The lifting theorem tells us that the domain of every Boolean type has the structure of
a Boolean algebra. It tells us more, actually: t is not just a Boolean type, but Dt is
finite, hence the domain Dt forms a complete atomic Boolean algebra. And this
means that the domain of every type in BOOL forms a complete atomic Boolean
algebra.
With that we have proved:
Corollary: The domain of every Boolean type is isomorphic to a powerset Boolean algebra.
To apply this to the type theory we now mention Cantor's cardinality theorems (the
first two of the three facts mentioned, since the third is trivial):
Cantor's Theorems:
1. |pow(A)| = 2^|A|
2. |(A → B)| = |B|^|A|
3. |(A × B)| = |A| × |B|
We are now ready to harvest the fruits of the discussion.
D<e,t> is isomorphic to pow(D)
D<e,t> = (D → {0,1})
|(D → {0,1})| = |{0,1}|^|D| = 2^|D| = |pow(D)|
From this, with what we now know, the above statement follows.
The domain of functions from individuals to truth values is, up to isomorphism, the
domain of sets of individuals.
Thus functions from individuals to truth values are, up to isomorphism, just sets of
individuals. Which sets depends on the choice of the isomorphism.
The isomorphism we choose is the one that is maximally useful for conceptual and
linguistic purposes:
CHAR: pow(D) → D<e,t>
CHAR⁻¹: D<e,t> → pow(D)
(CHAR⁻¹ = {<f,X>: f ∈ D<e,t> and X ∈ pow(D) and CHAR(X)=f})
CHAR is the function that maps every set onto its characteristic function.
CHAR⁻¹ is the isomorphism such that for every function f ∈ D<e,t>:
CHAR⁻¹(f) = {d ∈ D: f(d)=1}
Thus, the function which maps every function onto the set characterized by it is an
isomorphism (and so is the function that maps every set onto its characteristic
function).
Example. Let D = {a,b,c}.
[In the original picture both domains are drawn as cubes; here the correspondence is listed pairwise.]
The elements of D<e,t> (given by their values on a, b and c) and the corresponding sets in pow(D):
a↦1 b↦1 c↦1  ↔  {a,b,c}
a↦1 b↦1 c↦0  ↔  {a,b}
a↦1 b↦0 c↦1  ↔  {a,c}
a↦0 b↦1 c↦1  ↔  {b,c}
a↦1 b↦0 c↦0  ↔  {a}
a↦0 b↦1 c↦0  ↔  {b}
a↦0 b↦0 c↦1  ↔  {c}
a↦0 b↦0 c↦0  ↔  Ø
The isomorphisms CHAR and CHAR1 map elements in the same position the cubes
onto each other.
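Here is the same isomorphism as a small Haskell sketch (my own code): CHAR turns a set into its characteristic function, CHAR⁻¹ turns a function back into the set it characterizes.

type D = Char

domainD :: [D]
domainD = "abc"

-- CHAR maps a set onto its characteristic function
char :: [D] -> (D -> Bool)
char set = \d -> d `elem` set

-- CHAR inverse maps a function onto the set it characterizes
charInv :: (D -> Bool) -> [D]
charInv f = [ d | d <- domainD, f d ]

main :: IO ()
main = do
  let f = char "ac"             -- the characteristic function of {a,c}
  print (map f domainD)         -- [True,False,True]
  print (charInv f)             -- "ac": we get the set back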
We could have taken another isomorphism, say, the function that maps every f onto
{d ∈ D: f(d)=0}. While technically sound, that would only have been confusing for
us. Why? Well, mainly because we are creatures of habit. We are so used to
thinking of a predicate logical formula WALK(j) as expressing the truth conditions of
the natural language sentence John walks, that a logic in which the formula WALK(j)
would express the truth conditions of the natural language sentence John doesn't walk
would bother us. Yet, technically, there wouldn't be anything wrong with such a
logic, as long as the formula ¬WALK(j) expressed the truth conditions of the
natural language sentence John walks.
Technically, there is nothing wrong, but practically such a logic would be dreadful.
Our brain wants its calculating systems to be mnemonic and gets dreadfully upset
when the mnemonics is upside down (that is why, in a theory of individuals and
events, I change the name of type e to d, so that I can use e for events). Practically
useful logics are mnemonic, and that is the reason (and, given the existence of
different isomorphisms, the only reason) that we make sure that the logical formula
WALK(j) expresses that FM(j) ∈ FM(WALK) and not that FM(j) ∉ FM(WALK).
So the isomorphism we use is CHAR, and this means that we identify a function
into truth values with the set of arguments mapped onto 1.
D<e,<e,t>> is isomorphic to pow(D×D)
D<e,<e,t>> = (D → (D → {0,1}))
|D<e,<e,t>>| = |(D → {0,1})|^|D| = (2^|D|)^|D|
|pow(D×D)| = 2^|D×D| = 2^(|D|×|D|)
Now (2n)m = 2nm.
You don't remember?
Well, show first that 2n2m = 2n+m.
That is just a question of counting 2's:
2n2m = (2...2)(2...2) = (2....................2)
n
m
n+m
Given this:
(2n)m = (2n)...(2n) = 2(n + ... + n) = 2nm
m
m
So indeed, D<e,<e,t>> is isomorphic to pow(D×D). That means that the domain of
functions from individuals to functions from individuals to truth values is, up to
isomorphism, just the domain of all two-place relations, where relations are sets of
ordered pairs.
The difference between the two perspectives perhaps becomes clearer if we look at a
third set: ((D×D) → {0,1}).
|((D×D) → {0,1})| = 2^(|D|×|D|), so this domain is also isomorphic to pow(D×D).
That's not a surprise, because we have just above studied the relation between
(A → {0,1}) and pow(A). So ((D×D) → {0,1}) is just the set of all the characteristic
functions of the sets in pow(D×D). So, obviously, we can switch between a set of
ordered pairs and the function that maps each ordered pair in that set onto 1 and each
ordered pair not in that set onto 0.
Now look at ((DD){0,1}) and (D  (D  {0,1})).
A function in ((DD){0,1}) takes an ordered pair in DD and maps it onto a truth
value.
A function in (D  (D  {0,1})) takes an individual d in D and maps it onto a
function fd in (D  {0,1}. That function fd takes an individual d' in D and maps it
onto a truth value fd(d'). What the isomorphism says is that the total effect of this is
that in two steps an ordered pair of d' and d is mapped onto a truth value.
In other words:
-a function in ((D×D) → {0,1}) is a relation which maps two arguments
simultaneously onto a truth value.
An n-place relation takes n arguments.
-We can think of it as a set of n-tuples.
-We can also think of it as a function that takes all the n arguments simultaneously,
and maps them as an n-tuple onto a truth value.
An n-place relation interpreted as a function that takes all n arguments simultaneously
we call an uncurried function.
-We can also think of it as a function that takes the n arguments one at a time,
mapping the first argument in onto an n−1 place function, then mapping the second
argument in onto an n−2 place function, ..., mapping the n−1th argument onto a 1-place function, and mapping the last argument onto a truth value.
An n-place relation interpreted as a function that takes the n arguments one by one we
call a curried function.
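Functional programming languages have both perspectives built in. In Haskell, for instance, the Prelude functions curry and uncurry convert between the two; the kiss example below is mine, purely for illustration.

-- a "relation" in the uncurried view: ordered pairs to truth values
kissUncurried :: (String, String) -> Bool
kissUncurried (kisser, kissee) = (kisser, kissee) == ("Mary", "John")

-- the same function in the curried view: one argument at a time
kissCurried :: String -> String -> Bool
kissCurried = curry kissUncurried

main :: IO ()
main = do
  print (kissUncurried ("Mary", "John"))        -- True
  print (kissCurried "Mary" "John")             -- True: same function, curried
  print (uncurry kissCurried ("Mary", "John"))  -- and back again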
Though the topic is spicy, the terminology curried has nothing to do with Indian food, but
with the mathematician Curry, who in the nineteen-thirties developed Combinatory
Logic (which is one of the characterizations of the notion of algorithmic function,
computable function). Curry based Combinatory Logic on the technique of currying
functions (turning n-place functions into functions that take their arguments one at a
time). The actual technique of currying functions goes back to a paper by
Schönfinkel in the nineteen-twenties.
The importance of currying (and uncurrying) for semantics can hardly be
overestimated. You see it directly when you consider the compositionality problem,
the problem of finding interpretations for the constituents:
Go back to predicate logic. Take a formula like KISS(MARY, JOHN). The
construction tree in the logical language of this formula is:
[FORM [PRED2 KISS] [TERM MARY] [TERM JOHN]]
Currying predicate logic means that we can take a function CURRY(KISS) which
also takes two arguments, but one at a time. This could give us a construction tree of
the following form:
[FORM [PRED1 [PRED2 CURRY(KISS)] [TERM JOHN]] [TERM MARY]]
And given our considerations about semantic domains, the semantics of
CURRY(KISS) could give you the same interpretation for the sentence.
In fact, to get the same semantics as you have assigned to KISS in predicate logic, the
only thing you need to make sure of is that CURRY maps FM(KISS) onto the right
function.
But, of course, this is eminently useful if we are dealing with the interpretation of
natural language expressions that combine with their arguments one at a time, like
verbs:
[S [NP Mary] [VP [V kiss] [NP John]]]
The domain D<e,<e,t>> and pow(DD) are isomorphic. As before, there is more than
one isomorphism between these domains. Which one we choose is determined by
what we want with the theory and again the mnemonics of the logic.
The mnemonics of the theory of relations is, once again, determined by how we are
used to writing expressions with relations in predicate logic.
The mnemonics is given by the fact that we are used to regarding the truth conditions of
the formula KISS(j,m) as adequate for those of the natural language expression John
kisses Mary. This means that when we require that <j,m> ∈ FM(KISS), we interpret
the first element of the ordered pair <j,m> as the kisser and the second element as
the kissee.
This mnemonics is a given fact that we are not going to tinker with.
(The mnemonics is, by the way, much less strong for three-place predicates. If I write
a formula GIVE(a,b,c) it is clear to me (and that is the mnemonics) that a is the giver.
But for the other two arguments I would have to tell you who's on second, i.e. fix
the convention.)
One third of the connection between D<e,<e,t>> and pow(D×D) is going to be fixed by
the mnemonic fact that when we say <j,m> ∈ FM(KISS) we fix j as the kisser.
The second third is fixed by the mnemonics of type theory.
The type <e,<e,t>> is interpreted as the domain (D → (D → {0,1})).
We see two e's in this type, and the left-right order of the e's in the type encodes the
order in which the arguments go into a function of this type:
<e,<e,t>>  (the first e marks argument 1, the second e marks argument 2)
This is the fundamental mnemonics of functional type theory: a function of type
<e,<e,t>> combines with an argument of type e (1 = first argument in) to give a
function of type <e,t>. That one combines with an argument of type e (2 = second
argument in) to give a truth value.
This mnemonics was fixed by the definition of Da.
As far as the functions of this type themselves go, we read the arguments from the
inside out.
A curried function f in domain (D → (D → {0,1})) takes two elements p, q ∈ D in
order to get a truth value:
The first argument in gives you a function in (D → {0,1}):
f(p) ∈ (D → {0,1})
The second argument in gives you a truth value in {0,1}:
[f(p)](q)
This means indeed that in curried functions we count from the inside out. This is
not a mnemonics, this is the way the theory works:
Type: <e, <e,t>> with argument slots 1 and 2
Argument combination: [f(1)](2)
And this generalizes:
The type of three-place relations between individuals is:
<e,<e,<e,t>>> with argument slots 1, 2 and 3
Again, the numbers indicate the order in which the arguments go into the function.
If f is a function in domain D<e,<e,<e,t>>>, which is (D → (D → (D → {0,1}))), then the
order the arguments go in is, again, inside out:
Type: <e, <e, <e,t>>>
Argument combination: [[f(1)](2)](3)
In general, for n-place functions:
Type: <e, <e, <... <e, <e,t>>...>> with argument slots 1, 2, ..., n−1, n
Argument combination: [[...[[f(1)](2)]...](n−1)](n)
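A tiny Haskell illustration (mine, not from the text) of this inside-out order for a three-place curried function: each argument fed in peels off one arrow of the type, exactly as the bracketing [[f(1)](2)](3) indicates.

-- type <e,<e,<e,t>>>, with D = Char; the particular relation is arbitrary
give :: Char -> Char -> Char -> Bool
give x y z = (x, y, z) == ('a', 'b', 'c')

partial1 :: Char -> Char -> Bool   -- f(1): a 2-place function remains
partial1 = give 'a'

partial2 :: Char -> Bool           -- [f(1)](2): a 1-place function remains
partial2 = partial1 'b'

main :: IO ()
main = print (partial2 'c')        -- [[f(1)](2)](3): a truth value, True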
The final third of the connection between D<e,<e,t>> and pow(DD) is not fixed by
mnemonics but by what we are using the theory for. And this final third is precisely
the choice of the isomorphism between the two domains.
For grammatical reasons we semanticists first want to combine the interpretation of
kiss with the interpretation of the object John and then with the interpretation of the
subject Mary.
If we write KISS for the kissing relation in pow(D×D), we have already, following the
standard mnemonics, interpreted <j,m> ∈ KISS as: j is the kisser and m is the kissee.
If we call the isomorphism from pow(D×D) into (D → (D → {0,1})) CURRY, then
these two concerns fix the isomorphism CURRY:
[[CURRY(KISS)](j)](m) = 1 iff <m,j> ∈ KISS
CURRY is the function from pow(D×D) into D<e,<e,t>> such that:
for all R ⊆ D×D: for all d1,d2 ∈ D: <d1,d2> ∈ R iff [[CURRY(R)](d2)](d1)=1
Equivalently,
CURRY⁻¹ is the function from D<e,<e,t>> into pow(D×D) such that:
for every f ∈ D<e,<e,t>>: CURRY⁻¹(f) = {<d1,d2> ∈ D×D: [f(d2)](d1)=1}
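As a sketch (my own code, on a toy two-element domain), here is CURRY and its inverse in Haskell; note how the curried function takes the arguments in the inverse of the order in which they sit in the ordered pair:

type D = String
type Rel = [(D, D)]                        -- sets of ordered pairs <kisser,kissee>

kiss :: Rel
kiss = [("Mary", "John")]                  -- Mary kisses John

-- CURRY(R): <d1,d2> in R iff [[CURRY(R)](d2)](d1) = 1
curryRel :: Rel -> (D -> D -> Bool)
curryRel r = \d2 d1 -> (d1, d2) `elem` r

domainD :: [D]
domainD = ["Mary", "John"]

-- the inverse: collect the pairs <d1,d2> with [f(d2)](d1) = 1
curryRelInv :: (D -> D -> Bool) -> Rel
curryRelInv f = [ (d1, d2) | d1 <- domainD, d2 <- domainD, f d2 d1 ]

main :: IO ()
main = do
  print (curryRel kiss "John" "Mary")      -- ((KISS(JOHN))(MARY)) = 1: Mary kisses John
  print (curryRelInv (curryRel kiss))      -- [("Mary","John")]: we recover the relation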
For the type logical language this means that if we see the following expression:
((KISS(JOHN))(MARY))
we should not be misled by the left-right order of the arguments in this expression to
think that it means that John kisses Mary. It means that Mary kisses John.
The same holds for the type <e,<e,t>> with its argument slots 1 and 2: the numbers I
have used here indicate the order in which the arguments go in, and not the order in
which they will be sitting in the ordered pair.
Given CURRY, the order in which the arguments go in is precisely the inverse order
in which they sit in the ordered pair. The above formula corresponds to the predicate
logical formula:
KISS( MARY, JOHN )
This may be confusing in the beginning, but there is nothing to be done about it, you
need to get used to it.
There is nothing to be done about it, because:
1. We don't want to reinterpret KISS( MARY, JOHN) to mean that John kisses Mary.
That would again be dreadfully confusing.
2. We don't want to change the type-domain relation because, as you will find out, the
power of type theory as a tool lies precisely in the fact that function-argument
structure can be read off directly from the types.
3. We don't want to use another isomorphism than CURRY between these two
domains for linguistic reasons.
What we do do is introduce relational, uncurried notation as a notation convention in
the language of type logic, because relational notation has one great advantage: it
reduces the number of brackets.
Given all this we define:
Notation convention: Relational notation for <e,<e,t>>
Let α  EXP<e,<e,t>> and β1,β2  EXPe.
We define:
α(β1,β2) := ((α(β2))(β1))
The notation convention extends to all domains we assume to be linked by
isomorphism CURRY (that is, the isomorphism that works like CURRY works for
<e,<e,t>>).
This means that we understand, for α ∈ EXP<e,<e,<e,t>>>:
α(β1,β2,β3) := (((α(β3))(β2))(β1))
In fact, α can be an expression of type <b,<a,t>> and β2 ∈ EXPb and β1 ∈ EXPa.
Then we can also write, if we so want, ((α(β2))(β1)) as α(β1,β2).
This applies for instance to verbs like believe. We will later introduce a type of
propositions, which I will here call p. This will be the type of an expression like that
the earth is flat. We will have a constant BELIEVE of type <p,<e,t>> and will form
expressions like ((BELIEVE(β))(JOHN)), which we interpret as John believes β.
And, with CURRY, we will unproblematically write that in relational notation as
BELIEVE(JOHN,β).
This notation convention is meant to be useful. That means that we apply it when it is
useful to do so. If it is not useful, we don't apply it, or we apply something else.
For instance, we have assumed a constant OLD of type <<e,t>,<e,t>>. And we were
able to form a formula like ((OLD(MAN))(JOHN)).
Now D<<e,t>,<e,t>> is isomorphic to pow(D×pow(D)).
This means that, with CURRY, we could write the above expression in relational
notation: OLD(JOHN,MAN).
There is nothing wrong with that, except that it is conceptually easier to think of
OLD as a function from sets into sets, than as a relation between individuals and sets.
We switch from the functional perspective to the set theoretic perspective when it
helps us interpret the types: so <<e,t>,<e,t>> is the type of functions from sets
into sets. That's easier than functions from functions into functions. But it isn't
really easier than relations between individuals and sets. So we choose the middle
perspective.
There are a few domains where the isomorphism CURRY is not the most natural one.
For those we will blatantly choose another isomorphism and another notation
convention.
D<<e,t>,t> is isomorphic to pow(pow(D))
D<<e,t>,t> is the type of generalized quantifiers. Since <e,t> is (up to isomorphism) the
type of sets of individuals, <<e,t>,t> is the type of functions from sets of individuals
to {0,1}: (pow(D) → {0,1}). Thus a generalized quantifier is a function that takes a
set of individuals and maps it onto a truth value. But, of course, any such function f
characterizes a set: {X ∈ pow(D): f(X)=1}. And by now you should see that this
means that D<<e,t>,t> is isomorphic to pow(pow(D)).
Thus, generalized quantifiers are sets of sets of individuals.
D<<e,t>,<<e,t>,t>> is isomorphic to pow(pow(D)×pow(D))
Next, we look at the type <<e,t>,<<e,t>,t>> of determiners. A determiner like
EVERY is a function which maps a set of individuals BOY onto a set of sets of
individuals: (EVERY(BOY)). The latter is a function that maps a set of individuals
WALK onto a truth value: ((EVERY(BOY))(WALK)).
This means that EVERY is a function which takes a set of individuals (BOY), then
takes another set of individuals (WALK), and gives a truth value. But that means that
EVERY is a two-place relation between sets of individuals.
In Generalized Quantifier Theory, determiners are interpreted directly as relations
between sets of individuals. We have formulas like
EVERY[BOY,WALK]
and EVERY is interpreted as the subset relation between sets of individuals.
This is, of course, natural, but now we should note that the isomorphism this uses is
not CURRY, but an isomorphism that, for lack of a better term I will call
CORIANDER:
On <e,<e,t>>, CORIANDER is the isomorphism from pow(D×D) into D<e,<e,t>> such
that:
For every f ∈ D<e,<e,t>>: CORIANDER⁻¹(f) = {<d1,d2> ∈ D×D: [f(d1)](d2) = 1}
We generalize this to other types in the same way as CURRY.
So CORIANDER is the obvious isomorphism that we didn't want for type <e,<e,t>>,
but it is the one we want for type <<e,t>,<<e,t>,t>>, and that is because determiners
as two-place relations first combine with the nominal argument, and then with the
verbal argument.
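A small Haskell sketch (mine, not from the text) of EVERY in this CORIANDER-style argument order: the nominal set goes in first, the verbal set second, and the determiner checks the subset relation.

type D = String

boy, walk :: [D]                    -- sets of individuals: the set view of type <e,t>
boy  = ["Bill", "Fred"]
walk = ["Bill", "Fred", "Mary"]

subsetOf :: [D] -> [D] -> Bool
subsetOf xs ys = all (`elem` ys) xs

-- EVERY: first the noun set, then the VP set, i.e. ((EVERY(BOY))(WALK))
every :: [D] -> [D] -> Bool
every n vp = n `subsetOf` vp

main :: IO ()
main = print (every boy walk)       -- True: every boy walks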
If we use CURRY, we should interpret EVERY as the superset relation, and write
((EVERY(BOY))(WALK)) as EVERY(WALK,BOY).
But that is pathetic.
Instead, what we do is not use the relational notation introduced above, but a different
relational notation (with square brackets):
EVERY[BOY,WALK] := ((EVERY(BOY))(WALK))
While both notation conventions are defined for any relational type, we see that their
use is not completely trivial, because they indicate a deeper conceptual difference
between the semantics of determiners and the semantics of transitive verbs (indicated
by the difference in mnemonics).
In general.
Type <a,t> is the type of functions from a-entities into truth values and this is
the type of sets of a-entities.
For some types the set perspective helps, for others it doesn't. Thus:
-<e,t> is usually the type of sets of individuals.
-<<e,t>,t> is sometimes the type of functions from sets of individuals to truth values,
sometimes the type of sets of sets of individuals.
-<t,t> is the type of functions from truth values to truth values, the type of one-place
truth functions. We don't often think of negation as the set {0}. The functional
perspective is dominant here.
Type <a,<a,t>> is the type of functions from a-entities into functions from
a-entities into truth values and this is the type of relations between a-entities.
-<e,<e,t>> is usually the type of relations between individuals.
-<<e,t>,<<e,t>,t>> is either the type of functions from sets of individuals to sets of
sets of individuals, or the type of relations between sets of individuals.
-<t,<t,t>> is the type of functions from pairs of truth values to truth values, two-place truth functions, rarely that of relations between truth values.
Type <b,<a,t>> is the type of functions from b-entities into functions from
a-entities into truth values. These can be functions from b-entities into sets of
a-entities, or relations between a-entities and b-entities (under CURRY).
Thus <p,<e,t>> is the type of relations between individuals and propositions.
But <<e,t>,<e,t>> is the type of functions from sets of individuals to sets of
individuals.
In the course of using type theory you will get used to reading types in their intended
interpretation quite automatically. Again, technically, the above choices are arbitrary,
and you can feel free to use a different isomorphic interpretation if it helps you better.
Conceptually, the fact that we naturally think of, say, negation as a function and not at
all as a set may well point at something much more fundamental about how our brains
work.
2.5. Interpreting type-logic.
Let us come back to our sentence (1):
(1)
Some old man and every boy kissed and hugged Mary.
We gave the following type logical representation for this sentence:
(((AND1((EVERY(BOY))))((SOME((OLD(MAN))))))
((((AND2(HUGGED))(KISSED))(MARY)))) ∈ EXPt
Of this, we specified the following parts:
((AND1((EVERY(BOY))))((SOME((OLD(MAN)))))) ∈ EXP<<e,t>,t>
applies to:
(((AND2(HUGGED))(KISSED))(MARY)) ∈ EXP<e,t>
AND1 ∈ CON<<<e,t>,t>,<<<e,t>,t>,<<e,t>,t>>>
It applies first to:
(EVERY(BOY)) ∈ EXP<<e,t>,t>
and then to:
(SOME((OLD(MAN)))) ∈ EXP<<e,t>,t>
EVERY, SOME ∈ CON<<e,t>,<<e,t>,t>>
BOY, (OLD(MAN)) ∈ EXP<e,t>
OLD ∈ CON<<e,t>,<e,t>> and MAN ∈ CON<e,t>
AND2 ∈ CON<<e,<e,t>>,<<e,<e,t>>,<e,<e,t>>>>
It applies first to HUGGED and then to KISSED, both constants of type <e,<e,t>>
The result is:
((AND2(HUGGED))(KISSED)) ∈ EXP<e,<e,t>>
which applies to MARY ∈ CONe
The resulting expression is a well-formed expression of type t.
Let M = <D,F> be a model and g an assignment function. Let us calculate the
semantic interpretation of this expression:
⟦(((AND1((EVERY(BOY))))((SOME((OLD(MAN))))))
((((AND2(HUGGED))(KISSED))(MARY))))⟧M,g =
(⟦((AND1((EVERY(BOY))))((SOME((OLD(MAN))))))⟧M,g
(⟦(((AND2(HUGGED))(KISSED))(MARY)))⟧M,g) =
(⟦((AND1((EVERY(BOY))))((SOME((OLD(MAN))))))⟧M,g
((⟦((AND2(HUGGED))(KISSED))⟧M,g(⟦MARY⟧M,g)))) =
(⟦((AND1((EVERY(BOY))))((SOME((OLD(MAN))))))⟧M,g
((⟦((AND2(HUGGED))(KISSED))⟧M,g(F(MARY))))) =
(⟦((AND1((EVERY(BOY))))((SOME((OLD(MAN))))))⟧M,g
(((⟦(AND2(HUGGED))⟧M,g(⟦KISSED⟧M,g))(F(MARY))))) =
(⟦((AND1((EVERY(BOY))))((SOME((OLD(MAN))))))⟧M,g
(((⟦(AND2(HUGGED))⟧M,g(F(KISSED)))(F(MARY))))) =
(⟦((AND1((EVERY(BOY))))((SOME((OLD(MAN))))))⟧M,g
((((⟦AND2⟧M,g(⟦HUGGED⟧M,g))(F(KISSED)))(F(MARY))))) =
(⟦((AND1((EVERY(BOY))))((SOME((OLD(MAN))))))⟧M,g
((((F(AND2)(F(HUGGED)))(F(KISSED)))(F(MARY))))) =
((⟦(AND1((EVERY(BOY))))⟧M,g(⟦(SOME((OLD(MAN))))⟧M,g))
((((F(AND2)(F(HUGGED)))(F(KISSED)))(F(MARY))))) =
((⟦(AND1((EVERY(BOY))))⟧M,g((⟦SOME⟧M,g(⟦(OLD(MAN))⟧M,g))))
((((F(AND2)(F(HUGGED)))(F(KISSED)))(F(MARY))))) =
((⟦(AND1((EVERY(BOY))))⟧M,g((F(SOME)(⟦(OLD(MAN))⟧M,g))))
((((F(AND2)(F(HUGGED)))(F(KISSED)))(F(MARY))))) =
((⟦(AND1((EVERY(BOY))))⟧M,g((F(SOME)((⟦OLD⟧M,g(⟦MAN⟧M,g))))))
((((F(AND2)(F(HUGGED)))(F(KISSED)))(F(MARY))))) =
((⟦(AND1((EVERY(BOY))))⟧M,g((F(SOME)((F(OLD)(F(MAN)))))))
((((F(AND2)(F(HUGGED)))(F(KISSED)))(F(MARY))))) =
(((⟦AND1⟧M,g(⟦(EVERY(BOY))⟧M,g))((F(SOME)((F(OLD)(F(MAN)))))))
((((F(AND2)(F(HUGGED)))(F(KISSED)))(F(MARY))))) =
(((F(AND1)(⟦(EVERY(BOY))⟧M,g))((F(SOME)((F(OLD)(F(MAN)))))))
((((F(AND2)(F(HUGGED)))(F(KISSED)))(F(MARY))))) =
(((F(AND1)((⟦EVERY⟧M,g(⟦BOY⟧M,g))))((F(SOME)((F(OLD)(F(MAN)))))))
((((F(AND2)(F(HUGGED)))(F(KISSED)))(F(MARY))))) =
(((F(AND1)((F(EVERY)(F(BOY)))))((F(SOME)((F(OLD)(F(MAN)))))))
((((F(AND2)(F(HUGGED)))(F(KISSED)))(F(MARY)))))
Believe it or not, this is a truth value. Why?
Because F(MAN) is a function in D<e,t> and F(OLD) a function in D<<e,t>,<e,t>>, which
means that F(OLD)(F(MAN)) is a function in D<e,t> again. Since F(SOME) is a
function in D<<e,t>,<<e,t>,t>>, F(SOME)(F(OLD)(F(MAN))) is a function in D<<e,t>,t>.
F(AND1) takes that function and the function F(EVERY)(F(BOY)) and gives you
once again a function f ∈ D<<e,t>,t>.
By a similar argument, F(AND2) takes the functions F(HUGGED) and F(KISSED)
and gives a function in D<e,<e,t>>. This function takes F(MARY) and gives a function
g ∈ D<e,t>.
And indeed f(g) ∈ Dt.
This means that we have provided a solution to the compositionality problem. But we
cannot stop here. We have interpreted every expression as a non-logical constant and
the semantics for our type logical language puts no constraints on the interpretation of the
constants (except their typing). But that means that we still do worse than predicate
logic, because predicate logic got the truth conditions of our sentence right, and so far
we don't.
The problem is that at the moment we are saying that every boy denotes some set of
sets of individuals, but we are not saying which set. Similarly for some boy. At the
moment, we are saying that and1 and and2 are some functions that respectively map
pairs of sets of sets of individuals onto a set of sets of individuals, and pairs of
relations between individuals onto a relation between individuals, but we are not
saying which functions they denote. And if we want these expressions to have the
semantic properties that they have, we have to constrain their interpretations more.
This we can do by putting constraints on the interpretation function for the models.
That is, instead of assuming that the interpretation of AND1, AND2, EVERY, SOME
can vary from model to model, we fix for every model what their interpretation is.
Similarly, instead of assuming that the interpretation of OLD can vary arbitrarily across
models, we assume some constraint on what the interpretation of OLD can be in
different models.
In this way, what we are doing is restricting the class of models for our type logical
language TL to the class of models that satisfy these additional constraints. These
constraints capture aspects of the lexical semantics of the English lexical items that
these constants are used to represent. Hence we can see this restriction as a restriction
from the class of all models for type logic, to the class of all models for English
(where the relevant constants satisfy postulates that are appropriate for the lexical
semantics of the lexical items they represent). Such constraints on the interpretations
of constants used to represent lexical items of English are called meaning postulates.
I will give one example here. Take EVERY ∈ CON<<e,t>,<<e,t>,t>>.
This denotes (up to isomorphism) a relation between sets of individuals. And from
Generalized Quantifier Theory we know which relation, namely the subset relation
⊆.
Which function of D<<e,t>,<<e,t>,t>> is that?
Well, we use the isomorphisms for that:
The function of D<<e,t>,<<e,t>,t>> that corresponds to the subset relation in
pow(pow(D)×pow(D)) is the unique function G in D<<e,t>,<<e,t>,t>> such that:
CORIANDER⁻¹(G) = {<f1,f2>: f1,f2 ∈ D<e,t> and CHAR⁻¹(f1) ⊆ CHAR⁻¹(f2)}
where CORIANDER⁻¹(G) = {<f1,f2>: [G(f1)](f2) = 1}
The meaning postulate now says:
Let TLE, type logical English, be a type logical language where the constants we
choose are the ones we find useful for English.
Meaning Postulate on EVERY:
A model M = <D,F> for TL is only a model for TLE if F(EVERY) = G.
With this meaning postulate, we start predicting correct inference patterns, i.e. we
start getting the truth conditions right. We can now show:
⟦((EVERY(BOY))(WALK))⟧M,g = 1 iff [by the semantics of TL]
[F(EVERY)(F(BOY))](F(WALK)) = 1 iff [by the meaning postulate]
[G(F(BOY))](F(WALK)) = 1 iff [by the definition of G]
<F(BOY), F(WALK)> ∈ CORIANDER⁻¹(G) iff [by the def. of CORIANDER⁻¹]
CHAR⁻¹(F(BOY)) ⊆ CHAR⁻¹(F(WALK)) iff [by the definition of CHAR⁻¹]
{d ∈ D: F(BOY)(d)=1} ⊆ {d ∈ D: F(WALK)(d)=1} iff
for every d ∈ D: if F(BOY)(d)=1 then F(WALK)(d)=1.
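The same computation can be carried out concretely. The sketch below (my own code, for a toy model with D = {a,b,c}) interprets BOY and WALK as characteristic functions and defines G from the subset relation via CHAR⁻¹, exactly along the lines of the definition above:

type D = Char

domainD :: [D]
domainD = "abc"

-- F(BOY) and F(WALK): functions in D<e,t>, given as characteristic functions
boyF, walkF :: D -> Bool
boyF  d = d `elem` "ab"
walkF d = d `elem` "abc"

-- CHAR inverse: from a function to the set it characterizes
charInv :: (D -> Bool) -> [D]
charInv f = [ d | d <- domainD, f d ]

-- G: [G(f1)](f2) = 1 iff the set characterized by f1 is a subset of the one characterized by f2
g :: (D -> Bool) -> (D -> Bool) -> Bool
g f1 f2 = all (`elem` charInv f2) (charInv f1)

main :: IO ()
main = print (g boyF walkF)       -- True: every boy walks in this toy model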
We can similarly define the interpretations of F(SOME), F(AND1), F(AND2), and
define the interpretation of F(OLD) for OLD ∈ CON<<e,t>,<e,t>> in terms of the
interpretation of F(OLDp) for OLDp ∈ CON<e,t>. Our success in this will depend on
our capacity to define these interpretations correctly. This is not so much a question
of whether such definitions are possible at all: the type theory is rich enough, and we
are optimistic enough about our skills as semanticists, to be confident that such
definitions indeed can be given. The problem is practical: learning to do it. As you
can already see from the interpretation given and the definition of function G, it can
be quite hairy to keep track of which function is which, and which isomorphism we
need to apply at which point.
And, in fact, you don't have to learn this, if you don't want to. And that is because we
equipped the type logical language with a function definition operation, the lambda
operator, which, as we will see, will actually do the hard work for you.
To that we now turn.