Notes on Roundness and Negative Type — by Anthony Weston

Abstract. These notes are concerned with certain local and global geometric properties of metric spaces.
We will be predominantly interested in finite and countable metric spaces and our starting point will be the
classical work of A. Cayley [1] and I. J. Schoenberg [15, 16, 17] on the geometry of Hilbert space.
Acknowledgments. The final pristine form of these notes would not have been possible without the keen eye
and informed comments of Elena Caffarelli (Department of Mathematics, Rutgers University).
Contents
1. Metric and Hilbert spaces (the cast and crew)
2. A Hilbert space prelude to roundness and negative type
3. Roundness of a metric space
4. Negative type and generalized roundness
5. Enhanced and strict p-negative type
6. A quantitative lower bound on the generalized roundness of a finite metric space
7. Supremal p-negative type of a finite metric space cannot be strict
8. Appendix One: A closer look at Hilbert spaces
References
1. Metric and Hilbert spaces (the cast and crew)
We begin with a quote from G. F. Simmons [14, Chapter 2]:
A metric space (as we define it below) is nothing more than a non-empty set equipped with
a concept of distance which is suitable for the treatment of convergent sequences in the set
and continuous functions defined on the set.
Definition 1.1. A metric on a non-empty set X is a function d : X × X → [0, ∞) : (x, y) ↦ d(x, y) that
satisfies the following properties:
(M1) d(x, y) = 0 if and only if x = y;
(M2) d(x, y) = d(y, x) for all x, y ∈ X; and
(M3) d(x, y) ≤ d(x, z) + d(z, y) for all x, y, z ∈ X.
A metric space is a pair (X, d) where d is a metric defined on the non-empty set X.
Axiom (M3) is called the triangle inequality and it is intended to generalize the geometrical fact that the
length of any given side of a planar triangle cannot exceed the sum of the lengths of the other two sides.
This exceptionally simple inequality (that can be exceedingly difficult to prove in certain instances) lies at
the heart of modern mathematical analysis and it is rarely weakened or dispensed with. In the following
problem we collate some rudimentary examples of metric spaces.
Problem 1.2. In each case verify that d is a metric on the indicated set.
(1) The usual metric on the real line R is defined by d(x, y) = |x − y| for all x, y ∈ R.
(2) The usual metric on the complex plane C is defined by d(z, w) = |z − w| for all z, w ∈ C.
(3) Given a non-empty set X we define the discrete metric on X by
$$d(x, y) = \begin{cases} 1 & \text{if } x \neq y, \text{ and} \\ 0 & \text{if } x = y. \end{cases}$$
In (1) and (2), | · | denotes absolute value of a real number and modulus of a complex number, respectively.
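The three axioms in Definition 1.1 can be spot-checked numerically on a finite sample of points. The following is a minimal sketch in Python; the helper name `is_metric` and the sample points are illustrative assumptions, and a passing check on a sample is evidence only, not a proof of Problem 1.2.

```python
from itertools import product

def is_metric(points, d, tol=1e-12):
    """Spot-check (M1)-(M3) for d on a finite list of points (hypothetical helper)."""
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:                      # d must take values in [0, infinity)
            return False
        if (d(x, y) == 0) != (x == y):       # (M1)
            return False
        if d(x, y) != d(y, x):               # (M2)
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, y) > d(x, z) + d(z, y) + tol:  # (M3), with a small float tolerance
            return False
    return True

usual = lambda x, y: abs(x - y)              # works for real and complex numbers
discrete = lambda x, y: 0 if x == y else 1
squared = lambda x, y: (x - y) ** 2          # not a metric: (M3) fails

reals = [-2.0, -0.5, 0.0, 1.0, 3.5]
complexes = [0j, 1 + 1j, -2 + 0.5j, 3j]
labels = ["a", "b", "c"]

print(is_metric(reals, usual), is_metric(complexes, usual), is_metric(labels, discrete))
print(is_metric([0, 1, 2], squared))
```

The last line illustrates why (M3) is a genuine restriction: the squared distance on R satisfies (M1) and (M2) but violates the triangle inequality for the points 0, 1, 2.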
Example 1.3. Given a simple connected graph G = (V, E) we may consider the associated (ordinary or
unweighted) path metric ρ defined on the vertex set V of G as follows. Given two vertices x, y ∈ V, ρ(x, y)
is the length of (that is, the minimal number of edges e ∈ E appearing in) a shortest path connecting x and
y in G. Specific examples of graphs that we may choose to endow with the path metric include trees, cycles
and triangulated cycles. (See Footnote 1.)
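For a finite graph, the unweighted path metric of Example 1.3 can be computed by breadth-first search from each vertex. The sketch below is one way to do this in Python; the function name `path_metric` and the edge-list input format are assumptions made for illustration.

```python
from collections import deque

def path_metric(vertices, edges):
    """Return a dict rho[(s, t)] = number of edges on a shortest s-t path.

    Assumes the graph is simple and connected; a KeyError is raised otherwise.
    """
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    rho = {}
    for s in vertices:
        dist = {s: 0}
        queue = deque([s])
        while queue:                      # standard breadth-first search from s
            u = queue.popleft()
            for w in adjacency[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        for t in vertices:
            rho[(s, t)] = dist[t]
    return rho

# The 5-cycle 0 - 1 - 2 - 3 - 4 - 0.
cycle5 = path_metric(range(5), [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)])
print(cycle5[(0, 2)], cycle5[(0, 3)], cycle5[(0, 4)])
```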
Naturally, in the context of metric spaces, there is a notion of metric “subspaces” and it is exactly what
you would expect.
Definition 1.4. Let (X, d) be a metric space and let Y be a non-empty subset of X. Then Y inherits a
metric from X simply by restricting d to Y × Y . In other words,
dY (x, y) = d(x, y) for all x, y ∈ Y,
defines a metric on Y . The pair (Y, dY ) is called a metric subspace of (X, d). (In many instances, when
dealing with a metric subspace, we will simply write (Y, d) when in fact we mean (Y, dY ). This should cause
no confusion to the reader.)
The next definition introduces the class of metric spaces that are of central interest in functional analysis.
This definition is predicated in terms of vector spaces over R (or, more generally, over C).
Definition 1.5. A norm on a vector space X over R (or C) is a function ‖·‖ : X → [0, ∞) : x ↦ ‖x‖ that
satisfies the following properties:
(N1) ‖x‖ = 0 if and only if x = 0;
(N2) ‖αx‖ = |α| · ‖x‖ for all x ∈ X and α ∈ R (or C); and
(N3) ‖x + y‖ ≤ ‖x‖ + ‖y‖ for all x, y ∈ X.
A normed space is a pair (X, ‖·‖) where ‖·‖ is a norm defined on the vector space X.
Once again we call the axiom (N3) a triangle inequality. One reason for this is the following natural lemma
which points out that every norm on a vector space X induces a corresponding metric on X. Except in the
case of the trivial vector space {0} there are always an infinite number of ways that one can place a norm
on a vector space X provided it admits at least one norm.
Lemma 1.6. Let (X, ‖·‖) be a normed space. Define d(x, y) = ‖x − y‖ for all x, y ∈ X. Then d is a metric
on X.
Footnote 1. There will be instances when we weight the edges of a graph with arbitrary positive lengths. This produces a weighted path
metric on the graph. See, for example, Definition 5.7.
Proof. Axioms (M1) and (M2) follow trivially from the definitions. Let x, y, z ∈ X be given. In view
of axiom (N3) we see that
$$d(x, y) = \|x - y\| = \|(x - z) + (z - y)\| \le \|x - z\| + \|z - y\| = d(x, z) + d(z, y),$$
thereby establishing axiom (M3). □
Note 1.7. In dealing with a given normed space (X, ‖·‖) it will always be understood that the metric on
X is the one defined in the statement of Lemma 1.6 unless explicitly stated otherwise.
In some instances it is very easy to verify the triangle inequality for a purported norm on a vector space. For
example, given a closed and bounded interval [a, b] ⊆ R, let C[a, b] denote the set of all continuous functions
f : [a, b] → R. Under the usual pointwise operations of addition and scalar multiplication of functions one
readily verifies that C[a, b] is a vector space over R. An example of a particular norm on C[a, b] is given by:
$$\|f\|_1 = \int_a^b |f(x)|\,dx, \quad f \in C[a, b].$$
The axioms of a norm are easily verified for this designation simply by invoking basic properties of the
Riemann integral from Calculus I. For example, given f and g in C[a, b], we see that:
$$\begin{aligned}
\|f + g\|_1 &= \int_a^b |f(x) + g(x)|\,dx \\
&\le \int_a^b \big(|f(x)| + |g(x)|\big)\,dx \\
&= \int_a^b |f(x)|\,dx + \int_a^b |g(x)|\,dx \\
&= \|f\|_1 + \|g\|_1.
\end{aligned}$$
As it turns out, the norm that we have just described on C[a, b] is not nearly the most important one.
A basic result of Real Analysis I stipulates that every f ∈ C[a, b] is bounded insofar as there is a constant
M = M_f > 0 (depending on f) such that |f(x)| ≤ M for all x ∈ [a, b]. Given that this is the case we may
therefore define another norm on C[a, b], called the supremum norm, by:
$$\|f\|_\infty = \sup_{x \in [a,b]} |f(x)| = \max_{x \in [a,b]} |f(x)|, \quad f \in C[a, b].$$
Notice that the above supremum is a maximum as indicated. This is because |f| is also continuous on the
closed and bounded interval [a, b] and therefore attains its supremum.
Problem 1.8. Verify that ‖·‖_∞ so defined satisfies the axioms of a norm on the vector space C[a, b].
As pointed out in Lemma 1.6 it then follows that
$$d(f, g) = \|f - g\|_\infty = \sup_{x \in [a,b]} |f(x) - g(x)|$$
(f, g ∈ C[a, b]) defines a metric on C[a, b].
The quote at the beginning of this section includes the forceful idea that the notion of a metric on a set
is abstracted in such a way that we obtain a sense of proximity which readily lends itself to a discussion of
convergent sequences in the set and continuous functions defined on the set. In the analysis of metric spaces,
the proximity or closeness of points to one another is typically quantified in terms of open and closed balls.
These are defined in the following way.
Definition 1.9. Let (X, d) be a metric space. If x ∈ X and ε > 0 we define:
(1) The open ball in X with center x and radius ε > 0 to be the set B_x(ε) = {y ∈ X : d(x, y) < ε}.
(2) The closed ball in X with center x and radius ε > 0 to be the set B̄_x(ε) = {y ∈ X : d(x, y) ≤ ε}.
1.1. Convergence. Convergence of a sequence in a metric space is defined in the same
way as it is in calculus.
Definition 1.10. Let (X, d) be a metric space. Let (xn ) be a sequence in X. The sequence (xn ) is said to
converge to a point x ∈ X if, for each ε > 0, there exists a positive integer k such that
d(xn, x) < ε for all n ≥ k.
In this case we write lim xn = x or xn → x. The point x is called the limit of the sequence (xn ).
While Definition 1.10 may seem somewhat abstract, it may be expressed purely in terms of the convergence
of sequences of real numbers. Simply notice that a sequence (xn ) converges to x ∈ X if and only if the
sequence of real numbers (d(xn , x)) converges to 0. The condition in Definition 1.10 may also be expressed
in terms of open balls:
... if, for each ε > 0, there exists a positive integer k such that xn ∈ B_x(ε) for all n ≥ k.
It is not difficult to prove that a sequence (xn) in a metric space (X, d) converges to at most one point
x ∈ X. (If xn → x and xn → y with x ≠ y just consider ε = d(x, y)/2 > 0 and obtain a contradiction by
using the triangle inequality.) It is also the case that if (xn ) is convergent then the set S = {xn } is bounded.
This (by definition) simply means that sup{d(s, t) : s, t ∈ S} < ∞.
Proceeding, again, in a fashion analogous to the discussion of sequences of real numbers in calculus, the
notion of a Cauchy sequence may be deployed.
Definition 1.11. Let (X, d) be a metric space. Let (xn ) be a sequence in X. The sequence (xn ) is said to
be a Cauchy sequence if, for every ε > 0, there exists a positive integer k such that
d(xn, xm) < ε for all n, m ≥ k.
Informally, a sequence is a Cauchy sequence if terms of the sequence can be made arbitrarily close together
(< ε) simply by going far enough out in the sequence (n, m ≥ k).
Problem 1.12. Prove that every convergent sequence in a metric space (X, d) is necessarily a Cauchy
sequence.
Critically, a Cauchy sequence need not converge. Whether it does depends on the ambient space and the metric under
consideration. For example, the Cauchy sequence (1/n) does not converge in the metric space ((0, 1], | · |) but it does
converge in the metric space ([0, 1], | · |). You may recall from Real Analysis I that every Cauchy sequence
in the real line (R, | · |) necessarily converges. Metric spaces that have this property are said to be complete.
Definition 1.13. A metric space (X, d) is said to be complete if every Cauchy sequence (xn ) in X converges
to some point x ∈ X.
Normed spaces that are complete with respect to the metric induced by the norm are of particular importance
in functional analysis.
Definition 1.14. A Banach space is a normed space that is complete with respect to the metric induced
by the norm. (See Lemma 1.6 and Note 1.7.)
It is a basic fact of functional analysis (which we will not prove here) that every finite dimensional normed
space is complete. So every finite dimensional normed space is a Banach space. The next problem points
out that the normed space (C[a, b], ‖·‖_∞) is an example of an infinite dimensional Banach space.
Problem 1.15. Prove that the normed space (C[a, b], ‖·‖_∞) is complete. (Hint: This is not the easiest
problem for the uninitiated. You may find you need to look a couple of things up.)
1.2. Continuity. We now turn to the question of continuity in the context of maps (functions) between
metric spaces. The definition of continuity is based on the familiar ε–δ definition from Calculus I.
Definition 1.16. Let (X, d) and (Y, ϱ) be metric spaces. Let f : X → Y be a function. The function f is
said to be continuous at a point x0 ∈ X if, for every ε > 0, there exists a δ > 0 such that
ϱ(f(x), f(x0)) < ε whenever d(x, x0) < δ.
The function f is said to be continuous if it is continuous at each point of X.
The condition in Definition 1.16 may be expressed rather succinctly in terms of open balls:
... if, for every ε > 0, there exists a δ > 0 such that f(B_{x0}(δ)) ⊂ B_{f(x0)}(ε).
Suppose that f : X → Y is a continuous function. One thing to be aware of in relation to Definition 1.16
is that the δ will typically depend not only on f and ε but also quite possibly on the particular point x0 ∈ X.
Situations where this is not the case are in fact very special and we shall return to this scenario shortly.
One nice property of continuity in the context of metric spaces is that it admits a purely sequential
characterization. (This is not always possible in the context of general topological spaces.)
Theorem 1.17. Let (X, d) and (Y, ϱ) be metric spaces. Let f : X → Y be a function and x0 ∈ X. Then f
is continuous at x0 if and only if for each sequence (xn) in X such that xn → x0, f(xn) → f(x0).
Moreover, f is continuous if and only if for every convergent sequence (xn) in X, lim f(xn) = f(lim xn).
Proof. (⇒) Suppose that f is continuous at x0 and assume that xn → x0 in X. The assertion is that
f(xn) → f(x0) in Y. To this end, let ε > 0 be given. As f is continuous at x0 we may choose a δ > 0 so that
$$\varrho(f(x), f(x_0)) < \varepsilon \quad \text{whenever} \quad d(x, x_0) < \delta. \tag{1.1}$$
And as xn → x0 we may also choose a positive integer k so that
$$d(x_n, x_0) < \delta \quad \text{for all} \quad n \ge k. \tag{1.2}$$
Combining (1.1) and (1.2) we see that ϱ(f(xn), f(x0)) < ε for all n ≥ k. Thus f(xn) → f(x0).
(⇐) Suppose that for each sequence (xn) in X such that xn → x0, f(xn) → f(x0). Assume, to the
contrary, that f is not continuous at x0. Hence there exists an ε0 > 0 such that for all δ > 0 there exists
an x ∈ X satisfying d(x, x0) < δ and ϱ(f(x), f(x0)) ≥ ε0. In particular, for each positive integer n, we
may take δ = 1/n and choose xn ∈ X so that d(xn, x0) < 1/n and ϱ(f(xn), f(x0)) ≥ ε0. By construction,
xn → x0 in X (by the Archimedean principle) but f(xn) ↛ f(x0) in Y. This contradicts our supposition.
The second part of the theorem is an immediate consequence of the first part. □
The sequential characterization of continuity can be used to prove many basic general facts about continuous
functions such as the following routine theorem.
Theorem 1.18. Let X, Y and Z be metric spaces.
(1) If f : X → Y and g : Y → Z are continuous functions, then the composite g ◦ f : X → Z is also
continuous.
(2) If f : X → Y is a continuous function and if W ⊆ X is a metric subspace of X, then the restriction
of f to W — namely, the function f|W : W → Y — is also continuous.
Problem 1.19. Prove Theorem 1.18.
1.3. Uniform continuity. The proof that a continuous real-valued function defined on a closed and
bounded interval is actually Riemann integrable relies on the substantial fact that such functions are “uniformly continuous”. This notion is usually only developed and properly explained in an advanced calculus
course such as Real Analysis. Like continuity, the notion of uniform continuity may be formulated purely
in terms of maps between general metric spaces. Unlike continuity, uniform continuity is defined globally
in terms of the entire domain of the function at hand. One cannot speak of a function as being uniformly
continuous at a point.
Definition 1.20. Let (X, d) and (Y, ϱ) be metric spaces. A function f : X → Y is said to be uniformly
continuous if, for every ε > 0, there exists a δ > 0 such that ϱ(f(x), f(z)) < ε for all x, z ∈ X satisfying
d(x, z) < δ.
By fixing z = x0 ∈ X in Definition 1.20, and referring back to Definition 1.16, we see that every uniformly
continuous function is, perforce, continuous. Examples, such as the following, make plain that continuous
functions need not be uniformly continuous.
Example 1.21. Consider the real line R endowed with its usual metric: d(x, y) = |x − y| for all x, y ∈ R.
The familiar function f : R → R : x ↦ x² (that is, f(x) = x²) is well known to be continuous. It is, however, not
uniformly continuous as we shall now set out to prove.
Proof. We set ε = 1 and consider an arbitrary δ > 0. The idea is to produce x, z ∈ R such that
|x − z| < δ but |f(x) − f(z)| = |x² − z²| > ε = 1. As the domain of f is the entire real line R we may set
x = 1/δ + δ/2 and z = 1/δ. Then |x − z| = δ/2 < δ but |x² − z²| = |1 + δ²/4| > 1. We conclude that f is
not uniformly continuous. □
Notice in the preceding proof that x → ∞ as δ → 0+. So the given proof does not apply if one restricts f
to a bounded interval. Indeed, if we restrict f(x) = x² to a bounded interval, then it is in fact uniformly
continuous. The reader may care to verify this assertion directly as an exercise.
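The witnesses used in the proof of Example 1.21 are easy to tabulate. The following Python sketch (the function name `witness` is an illustrative choice) evaluates x = 1/δ + δ/2 and z = 1/δ for a few values of δ and reports |x − z| and |x² − z²|. It is a numerical illustration of the argument in floating point arithmetic, not a substitute for it.

```python
def witness(delta):
    """Return the pair (x, z) from the proof of Example 1.21 for a given delta > 0."""
    x = 1.0 / delta + delta / 2.0
    z = 1.0 / delta
    return x, z

rows = []
for delta in (1.0, 0.1, 0.01):
    x, z = witness(delta)
    gap = abs(x - z)            # equals delta / 2 up to rounding
    jump = abs(x * x - z * z)   # equals 1 + delta**2 / 4 up to rounding
    rows.append((delta, gap, jump))
    print(f"delta={delta}: |x - z|={gap:.6f}, |x^2 - z^2|={jump:.6f}")
```

In every row the gap is below δ while the jump stays above ε = 1, and the points x and z drift off to infinity as δ shrinks.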
Example 1.22. Consider the real line R endowed with its usual metric: d(x, y) = |x − y| for all x, y ∈ R.
We claim that the (everywhere) differentiable function
$$f : \mathbb{R} \to \mathbb{R} : x \mapsto \frac{x}{1 + x^2}$$
is uniformly continuous.
Proof. We first note that if x < z, then the Mean Value Theorem from Calculus I implies that there
exists a t ∈ (x, z) such that
$$|f(x) - f(z)| = |f'(t)| \cdot |x - z| = \left| \frac{1 - t^2}{(1 + t^2)^2} \right| \cdot |x - z| \le |x - z|.$$
(The inequality follows because |f′(t)| ≤ 1 by inspection.) The assertion of uniform continuity is now rather
obvious. Given an arbitrary ε > 0 we may simply choose δ = ε. □
1.4. The Hilbert spaces ℓ_2^(n) and ℓ_2 (n ≥ 2). A very special class of normed vector spaces arises in
the context of certain inner product spaces. Without going into excessive generality, we will introduce two
of the most important such ringmasters: ℓ_2^(n) and ℓ_2. These objects are examples of “Hilbert spaces”. (They
are formally defined in Definition 1.28 below.) We first recall (from basic linear algebra) the notion of a real
inner product space. (See Footnote 2.)
Definition 1.23. Let V be a real vector space. An inner product on V is a function
(· | ·) : V × V → R : (x, y) ↦ (x | y)
that satisfies the following properties:
(IP1) (x | y) = (y | x) for all x, y ∈ V.
(IP2) (λx | y) = λ(x | y) for all λ ∈ R and all x, y ∈ V.
(IP3) (x + y | z) = (x | z) + (y | z) for all x, y, z ∈ V.
(IP4) (x | x) ≥ 0 for all x ∈ V.
(IP5) If (x | y) = 0 for all y ∈ V, then x = 0.
Every inner product on a real vector space induces a corresponding norm on the vector space in a natural
way. Norms induced in this way tend to have very special geometrical properties. Again, without going into
excessive detail, we will illustrate what we mean with a fundamental and familiar example.
Example 1.24. The real vector space R^(n) of all n-tuples of real numbers, equipped with the following
natural inner product
$$(x \mid y) = \sum_{j=1}^{n} x_j y_j,$$
provides the simplest example of a so-called Hilbert space. Here x = (x1, x2, ..., xn) and y = (y1, y2, ..., yn).
According to this inner product, the Euclidean norm of a vector x = (x1, x2, ..., xn) ∈ R^(n) is defined by:
$$\|x\|_2 = (x \mid x)^{1/2} = \left( \sum_{j=1}^{n} x_j^2 \right)^{1/2}.$$
It is clear that ‖·‖_2 satisfies the norm axioms (N1) and (N2) but the triangle inequality (N3) is less evident.
We will see that axiom (N3) may be derived as a consequence of the following lemma.
Lemma 1.25 (Cauchy-Schwarz Inequality). Let x = (x1, x2, ..., xn), y = (y1, y2, ..., yn) ∈ R^(n) be given.
Then |(x | y)| ≤ ‖x‖_2 · ‖y‖_2. In other words:
$$\left| \sum_{i=1}^{n} x_i y_i \right| \le \left( \sum_{i=1}^{n} |x_i|^2 \right)^{1/2} \cdot \left( \sum_{i=1}^{n} |y_i|^2 \right)^{1/2}.$$
Proof. Given fixed vectors x, y ∈ R^(n) simply note that
$$0 \le (tx + y \mid tx + y) = t^2 (x \mid x) + 2t (x \mid y) + (y \mid y)$$
is a non-negative real quadratic in the real variable t ∈ R. Thus the discriminant of this quadratic, namely
4(x | y)² − 4(x | x)(y | y), is non-positive. The lemma is now plainly evident. □
Footnote 2. A discussion of complex inner product spaces and Hilbert spaces is included in Appendix 8. This appendix is for students who want
to know a little bit more about Hilbert spaces in general.
Problem 1.26. Write a more explicit proof of Lemma 1.25 in the following manner. Given x = (x1, x2, ..., xn)
and y = (y1, y2, ..., yn) ∈ R^(n), simply verify that
$$0 \le \sum_{i,j=1}^{n} (x_i y_j - x_j y_i)^2 = 2 \|x\|_2^2 \cdot \|y\|_2^2 - 2 \left( \sum_{i=1}^{n} x_i y_i \right)^2.$$
This proof of Lemma 1.25 has the distinct advantage of being intuitive.
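Before attempting Problem 1.26 it can be reassuring to test the identity on a concrete pair of vectors. The Python sketch below uses integer vectors so that the arithmetic is exact; the helper names `inner`, `norm2` and `lagrange_sum` are illustrative assumptions, not notation from these notes.

```python
def inner(x, y):
    """The inner product (x | y) on R^(n)."""
    return sum(a * b for a, b in zip(x, y))

def norm2(x):
    """The Euclidean norm induced by the inner product."""
    return inner(x, x) ** 0.5

def lagrange_sum(x, y):
    """The double sum over i, j of (x_i y_j - x_j y_i)^2 from Problem 1.26."""
    n = len(x)
    return sum((x[i] * y[j] - x[j] * y[i]) ** 2 for i in range(n) for j in range(n))

x = (1, 2, 3)
y = (4, -5, 6)

left = lagrange_sum(x, y)
right = 2 * inner(x, x) * inner(y, y) - 2 * inner(x, y) ** 2
print(left, right)                                   # both sides of the identity
print(abs(inner(x, y)) <= norm2(x) * norm2(y))       # Cauchy-Schwarz for this pair
```

Since the left side is a sum of squares, the identity immediately gives (x | y)² ≤ (x | x)(y | y), which is Lemma 1.25.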
Corollary 1.27. For all x = (x1, x2, ..., xn), y = (y1, y2, ..., yn) ∈ R^(n), we have ‖x + y‖_2 ≤ ‖x‖_2 + ‖y‖_2.
Proof. In view of our basic definitions and the Cauchy-Schwarz inequality (Lemma 1.25) we see that:
$$\begin{aligned}
\|x + y\|_2^2 &= \sum_{i=1}^{n} |x_i + y_i|^2 \\
&= \sum_{i=1}^{n} x_i^2 + 2 \sum_{i=1}^{n} x_i y_i + \sum_{i=1}^{n} y_i^2 \\
&= \|x\|_2^2 + 2 \sum_{i=1}^{n} x_i y_i + \|y\|_2^2 \\
&\le \|x\|_2^2 + 2 \cdot \|x\|_2 \cdot \|y\|_2 + \|y\|_2^2 \\
&= \left( \|x\|_2 + \|y\|_2 \right)^2.
\end{aligned}$$
Taking square roots secures the desired outcome. □
So we are now in a position to say that the pair (R^(n), ‖·‖_2) is a normed space. In the mathematical literature
this normed space is often denoted by ℓ_2^(n) and we will use this notation throughout the remainder of this
document. Notice, in particular, that ‖·‖_2 induces the standard Euclidean metric ρ on R^(n):
$$\rho(x, y) = \|x - y\|_2 = \left( \sum_{i=1}^{n} (x_i - y_i)^2 \right)^{1/2} \quad \text{for all } x = (x_1, x_2, \ldots, x_n),\ y = (y_1, y_2, \ldots, y_n) \in \mathbb{R}^{(n)}.$$
As ℓ_2^(n) (n ∈ N) is a finite dimensional normed space it is automatically a Banach space. (See the
comment after Definition 1.14.) With this observation in mind, we are now in a position to formally define
what we mean by a Hilbert space in a somewhat passable way. (A more technically correct approach is taken
in Appendix 8. See Definition 8.8.)
Definition 1.28. A complete normed space (X, ‖·‖) is said to be a Hilbert space if there exists an inner
product (· | ·) on X such that
‖x‖ = (x | x)^{1/2} for all x ∈ X.
So a Hilbert space is a Banach space with some additional geometrical structure; namely, the norm on a
Hilbert space is derived from an inner product and the resulting metric space induced (in turn) by this norm
is, additionally, complete.
The comments that precede this definition make it plain that ℓ_2^(n) is a Hilbert space for each n ∈ N. These
finite dimensional Hilbert spaces have an infinite dimensional big sister of particular importance that we will
describe as an example only (without proof).
Example 1.29. Let S_2 denote the real vector space of all square summable sequences of real numbers (see Footnote 3)
endowed with the usual pointwise vector operations (x + y = (x_j + y_j), and so on). The Hilbert space ℓ_2
consists of S_2 endowed with the norm that is derived from the following natural inner product:
$$(x \mid y) = \sum_{j=1}^{\infty} x_j y_j; \quad x = (x_j),\ y = (y_j) \in S_2. \tag{1.3}$$
This inner product on S_2 induces the ℓ_2-norm:
$$\|x\|_2 = \left( \sum_{j=1}^{\infty} |x_j|^2 \right)^{1/2} \quad \text{for all } x = (x_1, x_2, \ldots, x_n, \ldots) \in S_2.$$
We state without proof that ℓ_2 = (S_2, ‖·‖_2) is a Hilbert space.
Footnote 3. Recall that a sequence (x_n) of real numbers is said to be square summable if the infinite series Σ |x_n|² converges.
Problem 1.30. If (x_j) and (y_j) are two arbitrary square summable sequences of real numbers, prove that
the infinite series Σ_j x_j y_j converges.
1.5. ℓ_p^(n) and ℓ_p-spaces in general (1 ≤ p ≤ ∞). The Hilbert spaces ℓ_2^(n) and ℓ_2 are specific
representatives of a more general class of Banach spaces. We will briefly describe the structure of these Banach
spaces without going into excessive detail.
Example 1.31. Let 1 ≤ p < ∞. The Banach space ℓ_p^(n) consists of the real vector space R^(n) endowed with
the norm ‖·‖_p defined by
$$\|x\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p} \quad \text{for all } x = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^{(n)}.$$
In the case p = 2 we obtain the Hilbert space ℓ_2^(n). If p ≠ 2 (and n ≥ 2) then ℓ_p^(n) is not a Hilbert space. (You may care
to try to prove this.)
Example 1.32. Let p = ∞. The Banach space ℓ_∞^(n) consists of the real vector space R^(n) endowed with the
norm ‖·‖_∞ defined by
$$\|x\|_\infty = \max_{1 \le i \le n} |x_i| \quad \text{for all } x = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^{(n)}.$$
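The norms of Examples 1.31 and 1.32 are simple to compute, and one standard way to see that a norm does not come from an inner product is to check that it violates the parallelogram identity (Theorem 2.1). The following Python sketch does this for two unit vectors in R^(2); the function names `p_norm` and `parallelogram_defect` are assumptions introduced for illustration.

```python
import math

def p_norm(x, p):
    """The norm of x in l_p^(n); pass p = math.inf for the max norm."""
    if p == math.inf:
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def parallelogram_defect(x, y, p):
    """||x+y||^2 + ||x-y||^2 - 2(||x||^2 + ||y||^2), computed with the p-norm."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return (p_norm(s, p) ** 2 + p_norm(d, p) ** 2
            - 2 * (p_norm(x, p) ** 2 + p_norm(y, p) ** 2))

x, y = (1.0, 0.0), (0.0, 1.0)
for p in (1, 2, 3, math.inf):
    print(p, parallelogram_defect(x, y, p))
```

The defect vanishes (up to rounding) only for p = 2 in this experiment, which is consistent with the claim that ℓ_p^(n) fails to be a Hilbert space when p ≠ 2.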
The preceding two examples provide a rich class of finite dimensional normed spaces. These finite dimensional
normed spaces have infinite dimensional analogues that are of great importance in modern functional analysis.
In a nutshell, we have the following two examples.
Example 1.33. Let 1 ≤ p < ∞. Let S_p denote the real vector space of all p-summable sequences of real
numbers (see Footnote 4) endowed with the usual pointwise vector operations (x + y = (x_j + y_j), and so on). The Banach
space ℓ_p consists of the real vector space S_p endowed with the norm ‖·‖_p defined by
$$\|x\|_p = \left( \sum_{i=1}^{\infty} |x_i|^p \right)^{1/p} \quad \text{for all } x = (x_1, x_2, \ldots, x_n, \ldots) \in S_p.$$
In the case p = 2 we obtain the Hilbert space ℓ_2. If p ≠ 2 then ℓ_p is not a Hilbert space. (You may care to
try to prove this.)
Example 1.34. Let p = ∞. Let S_∞ denote the real vector space of all bounded sequences of real numbers (see Footnote 5)
endowed with the usual pointwise vector operations (x + y = (x_j + y_j), and so on). The Banach space ℓ_∞
consists of the real vector space S_∞ endowed with the norm ‖·‖_∞ defined by
$$\|x\|_\infty = \sup_{i \in \mathbb{N}} |x_i| \quad \text{for all } x = (x_1, x_2, \ldots, x_n, \ldots) \in S_\infty.$$
Footnote 4. Recall that a sequence (x_n) of real numbers is said to be p-summable if the infinite series Σ |x_n|^p converges.
Footnote 5. Recall that a sequence (x_n) of real numbers is said to be bounded if there exists a constant M ≥ 0 such that |x_n| ≤ M for all
natural numbers n.
The spaces described in the preceding four examples are special instances of an even wider class of Banach
spaces, the so-called L_p-spaces, that arise in the theory of measure and integration. A very brief discussion
of one of these spaces, the Hilbert space L_2[a, b], occurs in Appendix 8 (Example 8.12).
2. A Hilbert space prelude to roundness and negative type
We now get down to brass tacks by discussing the historical origins of the notions of roundness and
negative type in modern mathematical analysis. The starting point is a well-known geometrical identity.
Theorem 2.1 (Parallelogram Identity). For all vectors x, y ∈ ℓ_2^(n),
$$\|x + y\|_2^2 + \|x - y\|_2^2 = 2 \left( \|x\|_2^2 + \|y\|_2^2 \right).$$
This basic identity holds in any Hilbert space and it is a first hint that Hilbert space has “roundness two”.
In fact (see Lemma 3.9), given any quadruple of points x00, x01, x11, x10 ∈ ℓ_2^(n), we have:
$$\rho(x_{00}, x_{11})^2 + \rho(x_{01}, x_{10})^2 \le \rho(x_{00}, x_{01})^2 + \rho(x_{01}, x_{11})^2 + \rho(x_{11}, x_{10})^2 + \rho(x_{10}, x_{00})^2 \tag{2.1}$$
with equality if and only if the points x00, x01, x11, x10 form the four corners of a (possibly degenerate)
parallelogram with diagonal pairs (x00, x11) and (x01, x10). Moreover, p = 2 is the largest possible exponent
for which (2.1) holds for all choices of points x00, x01, x11, x10 ∈ ℓ_2^(n). Indeed, suppose p > 2 and consider the
specific points x00 = (0, 0, 0, ..., 0), x01 = (0, 1, 0, ..., 0), x11 = (1, 1, 0, ..., 0), x10 = (1, 0, 0, ..., 0) ∈ ℓ_2^(n).
Then,
$$\rho(x_{00}, x_{11})^p + \rho(x_{01}, x_{10})^p = 2 (\sqrt{2})^p = 2^{\frac{2+p}{2}} > 4 = \rho(x_{00}, x_{01})^p + \rho(x_{01}, x_{11})^p + \rho(x_{11}, x_{10})^p + \rho(x_{10}, x_{00})^p,$$
because (2 + p)/2 > 2. The fact that p = 2 is the largest exponent for which the inequality (2.1) holds for all
choices of points x00, x01, x11, x10 ∈ ℓ_2^(n) is the real reason why we say that Hilbert space has roundness two.
Problem 2.2. Derive the parallelogram identity for arbitrary vectors x, y ∈ ℓ_2^(n).
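The unit square computation above can be reproduced numerically. The sketch below, in Python, evaluates the difference between the two sides of (2.1) with the exponent 2 replaced by a general p; the function names `rho` and `roundness_gap` are illustrative assumptions.

```python
import math

def rho(x, y):
    """The Euclidean metric on R^(n)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def roundness_gap(x00, x01, x11, x10, p):
    """Sum of p-th powers of the four sides minus that of the two diagonals."""
    diagonals = rho(x00, x11) ** p + rho(x01, x10) ** p
    sides = (rho(x00, x01) ** p + rho(x01, x11) ** p
             + rho(x11, x10) ** p + rho(x10, x00) ** p)
    return sides - diagonals

square = ((0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0))  # x00, x01, x11, x10
for p in (1, 2, 2.5, 3):
    print(p, roundness_gap(*square, p))
```

A non-negative gap means the inequality holds for this quadruple. For the square the gap is zero (up to rounding) at p = 2 and negative for every p > 2 tried, in line with the computation 2^((2+p)/2) > 4.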
Recall that an isometry in mathematics is any function (or map) that preserves the distances between
pairs of points. Let’s assume that we have a finite metric space (X, d) = ({x0, x1, x2, ..., xn}, d) with n ≥ 2
that is assumed to embed isometrically into the Hilbert space ℓ_2^(n). In other words, assume there exist points
y0, y1, ..., yn ∈ ℓ_2^(n) such that ‖yi − yj‖_2 = d(xi, xj) for all i, j (0 ≤ i, j ≤ n). Without any loss of generality
we may assume that y0 = (0, 0, 0, ..., 0). This is because metric distances in ℓ_2^(n) are translation invariant.
Now observe the following computation:
$$\begin{aligned}
d(x_i, x_j)^2 &= \|y_i - y_j\|_2^2 \\
&= \big( (y_i - y_0) + (y_0 - y_j) \mid (y_i - y_0) + (y_0 - y_j) \big) \\
&= \|y_i - y_0\|_2^2 + 2 (y_i - y_0 \mid y_0 - y_j) + \|y_0 - y_j\|_2^2 \\
&= d(x_i, x_0)^2 + 2 (y_i - y_0 \mid y_0 - y_j) + d(x_0, x_j)^2 \\
&= d(x_i, x_0)^2 - 2 (y_i \mid y_j) + d(x_0, x_j)^2 + \text{terms of the form } \pm 2 (y_0 \mid \cdot) \\
&= d(x_i, x_0)^2 - 2 (y_i \mid y_j) + d(x_0, x_j)^2.
\end{aligned}$$
The last equality follows because we have assumed that y0 = (0, 0, ..., 0) at the outset. Hence:
$$(y_i \mid y_j) = \frac{1}{2} \left( d(x_0, x_i)^2 + d(x_0, x_j)^2 - d(x_i, x_j)^2 \right) \tag{2.2}$$
for all i, j (1 ≤ i, j ≤ n). So if η1, η2, ..., ηn are any real numbers, it follows from (2.2) that
$$\begin{aligned}
0 \le \left\| \sum_{i=1}^{n} \eta_i y_i \right\|_2^2 &= \left( \sum_{i=1}^{n} \eta_i y_i \;\middle|\; \sum_{i=1}^{n} \eta_i y_i \right) \\
&= \sum_{i,j=1}^{n} (y_i \mid y_j) \eta_i \eta_j \\
&= \frac{1}{2} \sum_{i,j=1}^{n} \left( d(x_0, x_i)^2 + d(x_0, x_j)^2 - d(x_i, x_j)^2 \right) \eta_i \eta_j.
\end{aligned} \tag{2.3}$$
This means we have proven the forward implication of the following fundamental theorem.
Theorem 2.3 (Schoenberg [16]). Let n ≥ 2. Let (X, d) = ({x0, x1, x2, ..., xn}, d) be a finite metric space.
Then (X, d) embeds isometrically into the Hilbert space ℓ_2^(n) if and only if
$$0 \le \frac{1}{2} \sum_{i,j=1}^{n} \left( d(x_0, x_i)^2 + d(x_0, x_j)^2 - d(x_i, x_j)^2 \right) \eta_i \eta_j \tag{2.4}$$
for all choices of real numbers η1, η2, ..., ηn.
The proof of the reverse implication in Theorem 2.3 follows from properties of real symmetric n × n matrices.
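Condition (2.4) is straightforward to evaluate for a given distance matrix and a given choice of η1, ..., ηn. The Python sketch below does exactly that; the function name `schoenberg_form` and the matrix layout (row and column 0 correspond to the base point x0) are assumptions made for illustration. A single negative value certifies that (2.4) fails, and hence, by Theorem 2.3, that the space does not embed isometrically into Hilbert space. Non-negative values for a handful of test vectors prove nothing on their own.

```python
def schoenberg_form(D, eta):
    """Evaluate the right side of (2.4).

    D is the (n+1) x (n+1) matrix of distances d(x_i, x_j), with index 0 the
    base point x_0, and eta is the list (eta_1, ..., eta_n).
    """
    n = len(D) - 1
    total = 0.0
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            g = D[0][i] ** 2 + D[0][j] ** 2 - D[i][j] ** 2
            total += g * eta[i - 1] * eta[j - 1]
    return 0.5 * total

# Path metric on the 4-cycle x0 - x1 - x2 - x3 - x0.
C4 = [[0, 1, 2, 1],
      [1, 0, 1, 2],
      [2, 1, 0, 1],
      [1, 2, 1, 0]]

# Three collinear points at positions 0, 1, 2 on the real line.
P3 = [[0, 1, 2],
      [1, 0, 1],
      [2, 1, 0]]

print(schoenberg_form(C4, [1, -1, 1]))   # negative, so (2.4) fails for the 4-cycle
print(schoenberg_form(P3, [2, -1]))      # the form here equals (eta_1 + 2 eta_2)^2
```

The 4-cycle fails because x1 and x3 are both metric midpoints of x0 and x2, which cannot happen for distinct points of a Hilbert space.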
Remark 2.4. The condition (2.4) on the metric space (X, d) in the statement of Theorem 2.3 may be
expressed in a more symmetric (and, ultimately, far more helpful) way:
$$\sum_{i,j=0}^{n} d(x_i, x_j)^2 \eta_i \eta_j \le 0 \tag{2.5}$$
for all choices of real numbers η0, η1, ..., ηn such that Σ_{i=0}^{n} ηi = 0.
Proof. To see (2.4) and (2.5) are equivalent, note that the n × n matrix ((d(x0, xi)² + d(x0, xj)²) ηi ηj)_{i,j}
is symmetric and sum accordingly to get:
$$\begin{aligned}
\sum_{i,j=1}^{n} \left( d(x_0, x_i)^2 + d(x_0, x_j)^2 - d(x_i, x_j)^2 \right) \eta_i \eta_j
&= 2 \left( \sum_{i=1}^{n} \eta_i \right) \left( \sum_{j=1}^{n} d(x_0, x_j)^2 \eta_j \right) - \sum_{i,j=1}^{n} d(x_i, x_j)^2 \eta_i \eta_j \\
&= -2 \eta_0 \left( \sum_{j=1}^{n} d(x_0, x_j)^2 \eta_j \right) - \sum_{i,j=1}^{n} d(x_i, x_j)^2 \eta_i \eta_j \\
&= - \sum_{i,j=0}^{n} d(x_i, x_j)^2 \eta_i \eta_j,
\end{aligned} \tag{2.6}$$
where η0 = −Σ_{i=1}^{n} ηi (by definition or assumption depending on the direction of the implication). □
Problem 2.5. Verify the first equality in (2.6) by expanding a few terms on the left side in order to determine
the particular order of summation that leads directly to the expression given on the right side.
3. Roundness of a metric space
The notion of the (maximal) roundness of a general metric space was introduced by Per Enflo [3, 4, 5, 6].
The previous section gave a hint as to how roundness may be defined. The formal definition is as follows.
Definition 3.1. Let p ≥ 1 and let (X, d) be a metric space. We say that p is a roundness exponent of (X, d),
denoted by p ∈ r(X) or p ∈ r(X, d), if and only if for all quadruples of points x00, x01, x11, x10 ∈ X, we have:
$$d(x_{00}, x_{11})^p + d(x_{01}, x_{10})^p \le d(x_{00}, x_{01})^p + d(x_{01}, x_{11})^p + d(x_{11}, x_{10})^p + d(x_{10}, x_{00})^p. \tag{3.1}$$
The (maximal) roundness of (X, d), denoted by p(X, d), is then defined to be the supremum of the set of all
roundness exponents of (X, d):
p(X, d) = sup{p : p ∈ r(X, d)}.
This supremum will simply be denoted p(X) when the underlying metric d is clear from the context.
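For a finite metric space, Definition 3.1 can be tested directly by running over all quadruples of points. The following Python sketch is a brute-force check; the function name `is_roundness_exponent` and the two sample metrics are assumptions chosen for illustration, and the cost grows like the fourth power of the number of points.

```python
from itertools import product

def is_roundness_exponent(points, d, p, tol=1e-12):
    """Check inequality (3.1) with exponent p for every quadruple of points."""
    points = list(points)
    for x00, x01, x11, x10 in product(points, repeat=4):
        diagonals = d(x00, x11) ** p + d(x01, x10) ** p
        sides = (d(x00, x01) ** p + d(x01, x11) ** p
                 + d(x11, x10) ** p + d(x10, x00) ** p)
        if diagonals > sides + tol:
            return False
    return True

def cycle4(i, j):
    """Path metric on the 4-cycle with vertices 0, 1, 2, 3."""
    k = abs(i - j)
    return min(k, 4 - k)

def discrete(x, y):
    """The discrete metric."""
    return 0 if x == y else 1

print(is_roundness_exponent(range(4), cycle4, 1))      # every metric space has p = 1
print(is_roundness_exponent(range(4), cycle4, 1.5))    # fails on the 4-cycle
print(is_roundness_exponent("abc", discrete, 50))      # distances are 0 or 1 only
```

In the discrete case every distance is 0 or 1, so raising to the power p changes nothing and the p = 1 inequality carries over to every exponent.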
It is not difficult to see that 1 is a roundness exponent for every metric space (X, d). Indeed, choose
an arbitrary quadruple of points x00 , x01 , x11 , x10 in a given metric space (X, d). Now apply the triangle
inequality four times as follows:
$$\begin{aligned}
d(x_{00}, x_{11}) &\le d(x_{00}, x_{01}) + d(x_{01}, x_{11}), \\
d(x_{00}, x_{11}) &\le d(x_{10}, x_{00}) + d(x_{11}, x_{10}), \\
d(x_{01}, x_{10}) &\le d(x_{00}, x_{01}) + d(x_{10}, x_{00}), \text{ and} \\
d(x_{01}, x_{10}) &\le d(x_{01}, x_{11}) + d(x_{11}, x_{10}).
\end{aligned}$$
Upon adding these four inequalities and dividing by two we obtain:
$$d(x_{00}, x_{11}) + d(x_{01}, x_{10}) \le d(x_{00}, x_{01}) + d(x_{01}, x_{11}) + d(x_{11}, x_{10}) + d(x_{10}, x_{00}).$$
Hence 1 ∈ r(X, d). In particular, p(X, d) ≥ 1. This simple estimate makes it easy to compute the maximal
roundness of certain natural metric spaces. As an example, we consider the non-simply connected metric
space S¹. Recall that S¹ denotes the unit circle x² + y² = 1 endowed with the arc length metric d.
Lemma 3.2 (Lafont and Prassidis [9], Hjorth et al. [7]). The maximal roundness of S¹ is 1.
Proof. We know that p(S¹) ≥ 1. Let p be a roundness exponent of S¹. Consider the points x00 =
(1, 0), x01 = (0, 1), x11 = (−1, 0), x10 = (0, −1) ∈ S¹. Inequality (3.1) with exponent p for this
quadruple of points xij ∈ S¹ is 2 · π^p ≤ 4 · (π/2)^p. This shows that 2^p ≤ 2. Therefore p ≤ 1 and so
p(S¹) = 1. □
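The four points used in the proof of Lemma 3.2 can be checked numerically by parametrizing the circle by angle. In the Python sketch below the function names `arc_length` and `circle_gap` are illustrative assumptions; the quantity computed is the right side of (3.1) minus the left side for that one quadruple.

```python
import math

def arc_length(s, t):
    """Arc-length distance on the unit circle between the angles s and t (radians)."""
    gap = abs(s - t) % (2 * math.pi)
    return min(gap, 2 * math.pi - gap)

def circle_gap(p):
    """Sides minus diagonals in (3.1) for the points (1,0), (0,1), (-1,0), (0,-1)."""
    x00, x01, x11, x10 = 0.0, math.pi / 2, math.pi, 3 * math.pi / 2
    diagonals = arc_length(x00, x11) ** p + arc_length(x01, x10) ** p
    sides = (arc_length(x00, x01) ** p + arc_length(x01, x11) ** p
             + arc_length(x11, x10) ** p + arc_length(x10, x00) ** p)
    return sides - diagonals

for p in (1, 1.1, 1.5, 2):
    print(p, circle_gap(p))
```

The gap is zero (up to rounding) at p = 1 and negative for each larger exponent tried, matching the conclusion that 2^p ≤ 2 forces p ≤ 1.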
Remark 3.3. Lemma 3.2 shows that any metric space which contains an isometrically embedded circle (of
any positive radius) must have maximal roundness one. For example, any geodesic metric space that admits
a globally minimizing closed geodesic has this property. The following two problems generalize Lemma 3.2.
Problem 3.4. Let p > 1. Prove that $(x + y)^p > x^p + y^p$ for all x, y > 0.
Problem 3.5. Let (X, d) be a metric space that contains 4 points z1 , z2 , z3 , z4 such that d(z1 , z3 ) = d(z1 , z2 )+
d(z2 , z3 ) = d(z3 , z4 ) + d(z4 , z1 ) = d(z2 , z4 ). Prove that p(X, d) = 1. (Hint: The previous problem is relevant.)
One particularly pertinent application of Problem 3.5 concerns the Banach space $\ell_1$. The points z1 =
(0, 0, 0, . . .), z2 = (0, 1, 0, . . .), z3 = (1, 1, 0, . . .), z4 = (1, 0, 0, . . .) ∈ $\ell_1$ satisfy the condition of Problem 3.5.
Clearly the same comments apply to the finite dimensional cousins $\ell_1^{(n)}$, n ≥ 2. In summary:
Lemma 3.6. The Banach spaces $\ell_1^{(n)}$ (for any n ≥ 2) and $\ell_1$ have maximal roundness 1.
The following problem justifies the use of the word “maximal” in Definition 3.1.
Problem 3.7. Let (X, d) be a given metric space. Prove that r(X, d) is a closed subset of [1, ∞). Deduce
that if p(X) < ∞, then p(X) is a roundness exponent of (X, d).
There are many situations where p(X, d) < ∞. For example, if a metric space (X, d) contains distinct points
x, y, z such that $d(x, z) = \tfrac{1}{2} d(x, y) = d(z, y)$, then (3.1) fails for at least one quadruple whenever the exponent
p > 2. Indeed, set x00 = x, x11 = y and x01 = z = x10. For this quadruple of points, inequality (3.1)
reduces to
\[ d(x, y)^p \le 4 \cdot \frac{d(x, y)^p}{2^p} \quad \text{with } d(x, y) > 0, \]
and clearly the resulting inequality $2^p \le 4$ does not hold for any p > 2.
Remark 3.8. The condition $d(x, z) = \tfrac{1}{2} d(x, y) = d(z, y)$ considered in the preceding paragraph is a mild
one. It just says that z is a metric midpoint between x and y. In particular, a metric space (X, d) is said to
be midpoint convex if every pair of points x, y ∈ X has at least one metric midpoint z ∈ X between x and
y. So the bottom line is that no midpoint convex metric space can have maximal roundness p > 2.
The next lemma can be generalized to show that every Hilbert space H has maximal roundness p(H) = 2.
Lemma 3.9. The Hilbert space $\ell_2^{(n)}$ has maximal roundness 2.
Proof. Clearly $\ell_2^{(n)}$ is midpoint convex, so $p(\ell_2^{(n)}) \le 2$ by Remark 3.8. The proof will be complete if we
can show that 2 is a roundness exponent of $\ell_2^{(n)}$.
However, $\rho^2$ has a very special form that will make it easy to establish (3.1) in the case p = 2. Namely,
\[ \rho(x, y)^2 = \sum_{j=1}^{n} (x_j - y_j)^2 \quad \text{for all } x = (x_1, \dots, x_n),\ y = (y_1, \dots, y_n) \in \ell_2^{(n)}. \]
Now consider an arbitrary quadruple of points $x_{00}, x_{01}, x_{11}, x_{10} \in \ell_2^{(n)}$ and let $x_{kl}(j)$ denote the j-th coordinate
of $x_{kl}$. It suffices to prove that
\[
\begin{aligned}
(x_{00}(j) - x_{11}(j))^2 + (x_{01}(j) - x_{10}(j))^2 \le{} & (x_{00}(j) - x_{01}(j))^2 + (x_{01}(j) - x_{11}(j))^2 \\
& + (x_{11}(j) - x_{10}(j))^2 + (x_{10}(j) - x_{00}(j))^2
\end{aligned}
\tag{3.2}
\]
for each j, 1 ≤ j ≤ n. (For we may then add the resulting n inequalities to obtain the desired outcome.) By
setting $z_1 = x_{00}(j)$, $z_2 = x_{01}(j)$, $z_3 = x_{11}(j)$, $z_4 = x_{10}(j)$, we derive (3.2) simply by noting that
\[ (z_1 - z_2)^2 + (z_2 - z_3)^2 + (z_3 - z_4)^2 + (z_4 - z_1)^2 - (z_1 - z_3)^2 - (z_2 - z_4)^2 = (z_1 - z_2 + z_3 - z_4)^2 \ge 0. \]
□
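The two ingredients of the proof of Lemma 3.9 (the coordinatewise identity and the failure of (3.1) at a midpoint for p > 2) lend themselves to a quick numerical sanity check. The sketch below is illustrative only: the function names, the dimension 3, the fixed random seed and the sample size are all assumptions made for the example.

```python
import random

def euclid(x, y):
    """Euclidean distance between two coordinate sequences of equal length."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5

def roundness_gap(d, x00, x01, x11, x10, p):
    """Right side minus left side of (3.1); non-negative exactly when (3.1) holds."""
    diagonals = d(x00, x11) ** p + d(x01, x10) ** p
    edges = (d(x00, x01) ** p + d(x01, x11) ** p
             + d(x11, x10) ** p + d(x10, x00) ** p)
    return edges - diagonals

def identity_defect(z1, z2, z3, z4):
    """Difference of the two sides of the scalar identity used to derive (3.2)."""
    edges = (z1 - z2) ** 2 + (z2 - z3) ** 2 + (z3 - z4) ** 2 + (z4 - z1) ** 2
    diagonals = (z1 - z3) ** 2 + (z2 - z4) ** 2
    return edges - diagonals - (z1 - z2 + z3 - z4) ** 2

rng = random.Random(0)  # fixed seed so that the run is repeatable

def random_point(dim=3):
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]

# Smallest gap in (3.1) with p = 2 over 1000 random quadruples in R^3.
worst_gap = min(
    roundness_gap(euclid, random_point(), random_point(),
                  random_point(), random_point(), 2)
    for _ in range(1000)
)

# The midpoint quadruple x00 = x, x01 = z = x10, x11 = y with an exponent above 2.
midpoint_gap = roundness_gap(euclid, (0, 0), (1, 0), (2, 0), (1, 0), 2.5)

print(identity_defect(1, 5, -2, 7), worst_gap, midpoint_gap)
```

With integer inputs the identity defect is exactly zero; the random search should report no violation of (3.1) at p = 2 beyond rounding error, while the midpoint quadruple gives a negative gap at p = 2.5.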
3.1. Skew cubes. The maximal roundness of a metric space satisfies a beautiful inductive property
which we now set out to describe. The results of this section appear in Enflo [3]. The starting point is
to introduce the notion of a skew cube in an arbitrary metric space. The basic idea is to encode “higher
dimensional” cubes in an arbitrary metric space in a natural way.
Definition 3.10. Let n ≥ 2 be a natural number. By a skew cube or n-cube in a metric space (X, d) we
simply mean the encoded range $N = \{x_\varepsilon\}$ of a function $f : \{0, 1\}^n \to X : \varepsilon \mapsto x_\varepsilon$ whose domain is the
standard n-dimensional cube of all n-vectors $\varepsilon = (\varepsilon_1, \dots, \varepsilon_n)$ with coordinates chosen from the set {0, 1}.
An unordered pair of vertices $(x_\varepsilon, x_\delta)$ in an n-cube N is called a diagonal if $\varepsilon_i \ne \delta_i$ for all i ∈ {1, 2, . . . , n},
and an edge if $\varepsilon_i \ne \delta_i$ for precisely one i ∈ {1, 2, . . . , n}. In other words, a diagonal in N is, by definition,
the image under f of an ordinary diagonal in the standard n-cube $\{0, 1\}^n$, and so on.
Notation 3.11. Given an n-cube $N = \{x_\varepsilon\}$ in a metric space (X, d), the set of all diagonals in N will
be denoted D(N), and the set of all edges in N will be denoted E(N). Clearly $|D(N)| = 2^{n-1}$, and
$|E(N)| = n \cdot 2^{n-1}$. Moreover, for any unordered pair of vertices $f = (x_\varepsilon, x_\delta)$ in N we will use l(f) as a
shorthand for the metric length $d(x_\varepsilon, x_\delta)$. This allows for an efficient method of writing down roundness
related inequalities. For example, we can view condition (3.1) in Definition 3.1 as a statement about all
2-cubes $N = \{x_{ij}\}$ in X:
\[ \sum_{d \in D(N)} l(d)^p \le \sum_{e \in E(N)} l(e)^p. \]
With this terminology and notation in mind we develop one of the most striking properties of roundness.
This result is known to have fundamental applications in a number of mathematical fields.
Theorem 3.12 (Enflo [3]). If N is an n-cube in a metric space (X, d) that has maximal roundness p, then:
\[ \sum_{d \in D(N)} l(d)^p \le \sum_{e \in E(N)} l(e)^p. \tag{3.3} \]
In particular; if $d_{\min}$ denotes a diagonal of minimal d-length in N and $e_{\max}$ denotes an edge of maximal
d-length in N, then we must have $l(d_{\min}) \le n^{1/p} \cdot l(e_{\max})$.
Proof. Let (X, d) be a metric space of maximal roundness p < ∞. The proof proceeds by induction.
The theorem is true for n = 2 by Definition 3.1 and Problem 3.7. We assume that the theorem is true for
n − 1 and then deduce (3.3) for an arbitrary n-cube N in (X, d).
Let such an n-cube N = {xε } ⊆ X be given. To streamline the notation of the proof we identify each
point xε ∈ X with its vectorial subscript ε. In other words (1, 1, . . . , 1) now denotes the point x(1,1,...,1) , and
so on. We further let E = E(N ) denote the set of edges and D = D(N ) denote the set of diagonals of the
n-cube N. As noted above, $|E| = n \cdot 2^{n-1}$ and $|D| = 2^{n-1}$.
The next step is to partition N into two parts N0 and N1 . We let N0 consist of all ε = (ε1 , ε2 , . . . , εn ) ∈ N
such that εn = 0. Similarly, N1 will consist of all ε = (ε1 , ε2 , . . . , εn ) ∈ N such that εn = 1. In particular,
all points in N0 are of the form (ε1 , . . . , εn−1 , 0). This means that we may view N0 as an (n − 1)-cube by
ignoring the last coordinate (0). Similarly, N1 naturally presents itself as an (n − 1)-cube too. Let Ej and
Dj denote the edge and diagonal sets of Nj , respectively (j = 0, 1). Note that the edges in Nj are edges in
N but that the same cannot be said of the diagonals in Nj . No diagonal of Nj is a diagonal of N (j = 0, 1).
Even so, by our inductive hypothesis, the inequalities (3.3) hold for N0 and N1, and so
\[ \sum_{d \in D_j} l(d)^p \le \sum_{e \in E_j} l(e)^p, \quad \text{for } j = 0, 1. \tag{3.4} \]
The trick now is to incorporate the “missing” edges and diagonals from N into our argument by bridging
the void between N0 and N1 .
Let E01 = E \ (E0 ∪ E1 ) be the edges of N that lie between the subcubes N0 and N1 . Let (ε, δ) be a
given diagonal of the n-cube N . We may form the opposite diagonal (ε̄, δ̄) where ε̄ is obtained from ε by
changing the last coordinate of ε only. Ditto for the construction of δ̄ from δ. Now consider the following 2-cube:
x00 = ε, x11 = δ, x01 = ε̄, x10 = δ̄ ∈ X. There are $2^{n-2}$ distinct 2-cubes that can be constructed this way and
each one satisfies (3.1). In each case, the form of (3.1) is $l(d_1)^p + l(d_2)^p \le l(e_1)^p + l(e_2)^p + l(f_0)^p + l(f_1)^p$ where
d1, d2 ∈ D, e1, e2 ∈ E01, f0 ∈ D0 and f1 ∈ D1. Moreover, each edge and diagonal from the sets D, E01, D0
and D1 appears precisely once in such an inequality. Adding these $2^{n-2}$ inequalities we thus obtain (3.3):
\[
\begin{aligned}
\sum_{d \in D(N)} l(d)^p &\le \sum_{e \in E_{01}} l(e)^p + \sum_{d \in D_0} l(d)^p + \sum_{d \in D_1} l(d)^p \\
&\le \sum_{e \in E_{01}} l(e)^p + \sum_{e \in E_0} l(e)^p + \sum_{e \in E_1} l(e)^p \quad \text{(by (3.4))} \\
&= \sum_{e \in E(N)} l(e)^p.
\end{aligned}
\]
To see the final estimate given in the statement of the theorem, note that the left side of (3.3) is at least
$2^{n-1} \cdot l(d_{\min})^p$ and the right side of (3.3) is at most $n \cdot 2^{n-1} \cdot l(e_{\max})^p$. Thus $2^{n-1} \cdot l(d_{\min})^p \le n \cdot 2^{n-1} \cdot l(e_{\max})^p$.
Taking roots gives the desired estimate. □
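To make the combinatorics of Theorem 3.12 concrete, here is a small sketch that enumerates the diagonals and edges of the standard n-cube and evaluates both sides of (3.3) for a randomly chosen skew cube in Euclidean 3-space, using p = 2 (a roundness exponent of Euclidean space by Lemma 3.9). The names `cube_pairs` and `enflo_sums`, the choice n = 4 and the random data are assumptions made for illustration.

```python
import random
from itertools import combinations, product

def cube_pairs(n):
    """Diagonals and edges of the standard n-cube {0,1}^n (n >= 2), as pairs of 0/1 tuples."""
    vertices = list(product((0, 1), repeat=n))
    diagonals, edges = [], []
    for eps, delta in combinations(vertices, 2):
        differing = sum(a != b for a, b in zip(eps, delta))
        if differing == n:
            diagonals.append((eps, delta))
        elif differing == 1:
            edges.append((eps, delta))
    return diagonals, edges

def euclid(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5

def enflo_sums(points, n, d, p):
    """Left and right sides of (3.3) for the skew cube eps -> points[eps]."""
    diagonals, edges = cube_pairs(n)
    lhs = sum(d(points[e], points[f]) ** p for e, f in diagonals)
    rhs = sum(d(points[e], points[f]) ** p for e, f in edges)
    return lhs, rhs

rng = random.Random(1)  # fixed seed, arbitrary choice
n = 4
skew = {eps: [rng.uniform(-1.0, 1.0) for _ in range(3)]
        for eps in product((0, 1), repeat=n)}

diagonals, edges = cube_pairs(n)
lhs, rhs = enflo_sums(skew, n, euclid, 2)
shortest_diagonal = min(euclid(skew[e], skew[f]) for e, f in diagonals)
longest_edge = max(euclid(skew[e], skew[f]) for e, f in edges)
print(len(diagonals), len(edges), lhs <= rhs,
      shortest_diagonal <= n ** 0.5 * longest_edge)
```

The vertices of the skew cube are arbitrary points, so nothing about the shape of the cube is used beyond the labeling by {0,1}^n.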
Remark 3.13. Using the second estimate given in the statement of Theorem 3.12 to great effect, Enflo [3]
was able to show that the classical Banach spaces $\ell_p$ and $\ell_q$ are not uniformly homeomorphic if 1 ≤ p < q ≤ 2.
4. Negative type and generalized roundness
Notions of negative type and generalized roundness were formally introduced and studied by Menger [13],
Schoenberg [15, 16, 17] and Enflo [4], respectively. Menger and Schoenberg were interested in determining
which metric spaces can be isometrically embedded into a Hilbert space (see footnote 6). Enflo’s interest, on the other
hand, was to construct a separable metric space that admits no uniform embedding into any Hilbert space.
Rather surprisingly, as we shall see momentarily, the notions of negative type and generalized roundness are
actually equivalent. However, we first need to put some formal definitions in place before proceeding.
Definition 4.1. Let p ≥ 0 and let (X, d) be a metric space. Then:
(a) (X, d) has p-negative type if and only if for all finite subsets {x1 , . . . , xn } ⊆ X (n ≥ 2) and all
choices of real numbers η1 , . . . , ηn with η1 + · · · + ηn = 0, we have:
\[ \sum_{1 \le i, j \le n} d(x_i, x_j)^p \eta_i \eta_j \le 0. \]
(b) (X, d) has strict p-negative type if and only if it has p-negative type and the inequality in (a) is
strict whenever the scalar n-tuple (η1 , . . . , ηn ) ≠ (0, . . . , 0).
(c) We say that p is a generalized roundness exponent for (X, d), denoted by p ∈ gr(X) or p ∈ gr(X, d),
if and only if for all natural numbers n, and all choices of points a1 , . . . , an , b1 , . . . , bn ∈ X, we have:
\[ \sum_{1 \le k < l \le n} \left\{ d(a_k, a_l)^p + d(b_k, b_l)^p \right\} \le \sum_{1 \le j, i \le n} d(a_j, b_i)^p. \tag{4.1} \]
(d) We say that p is a strict generalized roundness exponent for (X, d) if and only if the inequality in
(c) is always strict.
Footnote 6. The explicit study of isometric embeddings into Hilbert space first emerged in the 1800s in the work of Cayley [1].
(e) The (maximal) generalized roundness of (X, d), denoted by q(X) or q(X, d), is defined to be the
supremum of the set of all generalized roundness exponents of (X, d):
\[ q(X, d) = \sup\{p : p \in gr(X, d)\}. \]
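For a finite metric space, both kinds of inequality in Definition 4.1 can be evaluated directly. The sketch below is an illustration rather than part of the notes: the function names and the choice of the 4-cycle with its path metric are assumptions, and the helper `power` implements the convention that zero to the power zero equals zero, which the notes adopt for the case p = 0.

```python
from itertools import combinations

def power(t, p):
    """t**p, with the convention that 0**0 is taken to be 0."""
    return 0.0 if t == 0 else float(t) ** p

def negative_type_form(d, points, eta, p):
    """The sum over all i, j of d(x_i, x_j)^p * eta_i * eta_j from part (a)."""
    return sum(power(d(x, y), p) * s * t
               for x, s in zip(points, eta)
               for y, t in zip(points, eta))

def generalized_roundness_gap(d, a, b, p):
    """Right side minus left side of (4.1); repetitions among a and b are allowed."""
    n = len(a)
    left = sum(power(d(a[k], a[l]), p) + power(d(b[k], b[l]), p)
               for k, l in combinations(range(n), 2))
    right = sum(power(d(x, y), p) for x in a for y in b)
    return right - left

def cycle4(u, v):
    """Path metric on the 4-cycle with vertices 0, 1, 2, 3 (illustrative example)."""
    k = abs(u - v) % 4
    return min(k, 4 - k)

a, b = (0, 2), (1, 3)          # opposite vertices against opposite vertices
eta = (1, -1, 1, -1)           # weights summing to zero
for p in (1.0, 1.2):
    print(p,
          generalized_roundness_gap(cycle4, a, b, p),
          negative_type_form(cycle4, (0, 1, 2, 3), eta, p))
```

For this particular choice of points and weights the two quantities are proportional (the form equals minus twice the gap), and both inequalities fail as soon as p exceeds 1.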
Remark 4.2. We collect here some important basic commentary on Definition 4.1.
(i) In making Definition 4.1 (c) it is important to point out that repetitions among the a’s and b’s are
allowed. Indeed, allowing repetitions is essential. We may, however, when making Definition 4.1
(c), assume that aj ≠ bi for all i, j (1 ≤ i, j ≤ n). This is due to an elementary cancellation of like
terms.
(ii) There are n(n − 1) terms on the left side and $n^2$ terms on the right side of the inequality (4.1) in
Definition 4.1 (c).
(iii) When p = 0, terms of the form $0^0$ may appear in some of the inequalities of Definition 4.1. In light
of this possibility we adopt the convention $0^0 = 0$. There are compelling technical reasons why this
is the correct convention in this setting.
(iv) Every metric space (X, d) has 0-negative type and 0 ∈ gr(X, d). The proof is an easy exercise.
(v) Every roundness inequality (3.1) can be expressed in the form of a generalized roundness inequality
(4.1) with the same exponent. The proof is simple. Consider a quadruple of points xij . Just take
n = 2 in Definition 4.1 (c) and set a1 = x00 , a2 = x11 , b1 = x01 and b2 = x10 . The implication
is then plainly evident. Hence, for any given metric space (X, d), generalized roundness cannot
exceed roundness: q(X, d) ≤ p(X, d).
(vi) The inequality q(X, d) ≤ p(X, d) can be strict. For example, $q(\ell_\infty) = 0$ while $p(\ell_\infty) = 1$. In fact,
the situation can be even worse in some cases. There exist metric spaces (X, d) such that q(X) = 0
and p(X) = 2. (Certain so-called CAT(0)-spaces in hyperbolic geometry are so inclined.)
(vii) It is even possible for the maximal generalized roundness of a finite dimensional Banach space to
be zero. On the other hand, for every finite metric space (X, d), q(X) > 0. (More on this shortly.)
Problem 4.3. Prove that $q(\ell_\infty) = 0$ while $p(\ell_\infty) = 1$. (It is even the case that $q(\ell_\infty^{(3)}) = 0$ but no explicit
proof is known to date.)
Fundamental to these notes is the case of Hilbert space.
Lemma 4.4. The Hilbert space $\ell_2^{(n)}$ has 2-negative type. In fact, every Hilbert space H has 2-negative type.
Proof. This follows from the calculations (2.3) and (2.7) in the proofs of Theorem 2.3 and Remark
2.4. □
The following problem justifies the use of the word “maximal” in Definition 4.1 (e).
Problem 4.5. Let (X, d) be a given metric space. Prove that gr(X, d) is a closed subset of [0, ∞). Deduce
that if q(X) < ∞, then q(X) is a generalized roundness exponent of (X, d).
As mentioned at the outset of this section, the classical interest in the 2-negative type condition arose out
of efforts to characterize those metric spaces that admit an isometric embedding into a Hilbert space. The
definitive result in this direction was obtained by Schoenberg [16].
Theorem 4.6 (Schoenberg [16]). A metric space (X, d) can be isometrically embedded in (some) Hilbert
space if and only if it has 2-negative type.
Proof. For a finite metric space (X, d) this is just Theorem 2.3. The general case involves an application
of Zorn’s Lemma and is omitted. □
A major surprise is that conditions (a) and (c) of Definition 4.1 are actually equivalent.
Theorem 4.7. Let p ≥ 0 and let (X, d) be a metric space. Then the following conditions are equivalent:
(a) (X, d) has p-negative type.
(b) For all s, t ∈ N, all choices of pairwise distinct points a1 , . . . , as , b1 , . . . , bt ∈ X and all choices of
real numbers m1 , . . . , ms , n1 , . . . , nt > 0 such that m1 + · · · + ms = 1 = n1 + · · · + nt , we have:
\[ \sum_{1 \le j_1 < j_2 \le s} m_{j_1} m_{j_2} d(a_{j_1}, a_{j_2})^p + \sum_{1 \le i_1 < i_2 \le t} n_{i_1} n_{i_2} d(b_{i_1}, b_{i_2})^p \le \sum_{j,i=1}^{s,t} m_j n_i d(a_j, b_i)^p. \tag{4.2} \]
(c) p is a generalized roundness exponent for (X, d).
Proof. Suppose (b) holds. Consider a subset {x1 , . . . , xn } ⊆ X (n ≥ 2) together with real numbers
η1 , . . . , ηn such that η1 + · · · + ηn = 0. To avoid a triviality we assume that (η1 , . . . , ηn ) ≠ (0, . . . , 0). By
discarding any points xi with ηi = 0 (they contribute nothing to the sum in Definition 4.1 (a)) and
relabeling (if necessary) we may assume there exist natural numbers s, t ∈ N such that s + t = n, η1 , . . . , ηs > 0
and ηs+1 , . . . , ηn < 0. Now make the following designations:
(i) As $\sum_{j=1}^{s} \eta_j = -\sum_{k=s+1}^{n} \eta_k$ we may define
\[ \alpha = \sum_{j=1}^{s} \eta_j = -\sum_{k=s+1}^{n} \eta_k = \frac{1}{2} \sum_{\ell=1}^{n} |\eta_\ell| > 0. \]
(ii) For 1 ≤ j ≤ s, set aj = xj and mj = ηj /α. And for 1 ≤ i ≤ t, set bi = xn−i+1 and ni = −ηn−i+1 /α.
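By construction; a1 , . . . , as , b1 , . . . , bt are distinct points in X, m1 , . . . , ms , n1 , . . . , nt > 0 and m1 + · · · + ms =
1 = n1 + · · · + nt . Applying (4.2), we see that
\[
\begin{aligned}
0 &\ge 2 \left( \sum_{1 \le j_1 < j_2 \le s} m_{j_1} m_{j_2} d(a_{j_1}, a_{j_2})^p + \sum_{1 \le i_1 < i_2 \le t} n_{i_1} n_{i_2} d(b_{i_1}, b_{i_2})^p - \sum_{j,i=1}^{s,t} m_j n_i d(a_j, b_i)^p \right) \\
&= \frac{1}{\alpha^2} \cdot \sum_{1 \le i, j \le n} d(x_i, x_j)^p \eta_i \eta_j.
\end{aligned}
\tag{4.3}
\]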
Hence (a) holds. The process just described is clearly reversible. Thus (a) implies (b) too. It is also plain
that (b) implies (c). The proof that (c) implies (b) uses Remark 4.2 (i) and appears in Lennard et al.
[12]. □
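The passage from the weights η to the normalized weights m and n, and the resulting identity (4.3), can be traced through on a small example. The following sketch is an illustration under stated assumptions: the function names are hypothetical, the points lie on the real line with its usual metric, and the particular weights and exponent are arbitrary choices.

```python
from itertools import combinations

def split_weights(points, eta):
    """Steps (i) and (ii) of the proof: alpha, then the pairs (a_j, m_j) and (b_i, n_i).

    Points with zero weight are dropped; they do not affect either side of (4.3).
    """
    alpha = sum(abs(e) for e in eta) / 2
    a = [(x, e / alpha) for x, e in zip(points, eta) if e > 0]
    b = [(x, -e / alpha) for x, e in zip(points, eta) if e < 0]
    return alpha, a, b

def normalized_difference(d, a, b, p):
    """Left side minus right side of (4.2)."""
    within_a = sum(m1 * m2 * d(x1, x2) ** p
                   for (x1, m1), (x2, m2) in combinations(a, 2))
    within_b = sum(n1 * n2 * d(y1, y2) ** p
                   for (y1, n1), (y2, n2) in combinations(b, 2))
    cross = sum(m * n * d(x, y) ** p for x, m in a for y, n in b)
    return within_a + within_b - cross

def quadratic_form(d, points, eta, p):
    """Sum of d(x_i, x_j)^p * eta_i * eta_j over i != j (diagonal terms vanish)."""
    idx = range(len(points))
    return sum(d(points[i], points[j]) ** p * eta[i] * eta[j]
               for i in idx for j in idx if i != j)

def line(u, v):
    return abs(u - v)

points = [0.0, 1.0, 2.5, 4.0, 7.0]
eta = [3, -1, 2, -4, 0]        # sums to zero; the last point carries no weight
p = 1.3

alpha, a, b = split_weights(points, eta)
middle = 2 * normalized_difference(line, a, b, p)
right = quadratic_form(line, points, eta, p) / alpha ** 2
print(alpha, middle, right)
```

The two printed values agree up to rounding, which is the equality in (4.3). Both are non-positive here because the real line embeds isometrically in Hilbert space and so has p-negative type for 0 ≤ p ≤ 2.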
Corollary 4.8. The Hilbert space $\ell_2^{(n)}$ has maximal generalized roundness 2. In fact, every Hilbert space H
has maximal generalized roundness 2.
Proof. We know that $q(\ell_2^{(n)}) \le p(\ell_2^{(n)}) = 2$ by Remark 4.2 (v) and Lemma 3.9. Additionally, $q(\ell_2^{(n)}) \ge 2$
by Lemma 4.4 and Theorem 4.7. Thus $q(\ell_2^{(n)}) = 2$, as asserted. □
4.1. The generalized roundness of spherically symmetric trees. Recall that given a connected
graph G the usual path distance between two vertices v and w is the smallest number of edges in any path
joining v to w. We shall call this the (ordinary or unweighted) path metric ρ on G. Many planar graphs
(G, ρ), including all trees (≡ connected graphs with no cycles), are known to have generalized roundness at
least one. In what follows, we shall denote the degree of a vertex v in a graph G by deg(v).
The connected graphs of interest in this subsection are trees with at most countably many vertices
and where each vertex is of finite degree. The condition on the degrees of the vertices is not a significant
restriction as it follows immediately from [2, Theorem 5.6] that if a tree T has a vertex of infinite degree,
then q(T, ρ) = 1. For brevity we shall use the term countable tree rather than countably infinite tree.
Many of our definitions will depend on the choice of a distinguished root vertex. We shall denote a tree
T with metric δ and root v0 by (T, δ, v0 ). If v0 or δ are clear in a given setting, we may simply write (T, δ) or
T . For a rooted metric tree (T, δ, v0 ) we shall let r(T ) = r(T, v0 ) denote the v0 -radius of T . In other words,
r(T, v0 ) is the maximal value of δ(v0 , v) if that quantity is finite and ∞ otherwise.
Our definition of spherically symmetric trees is based on the one in [10].
Definition 4.9. Let T be a tree endowed with the path metric ρ. We shall say that T is spherically
symmetric if we can choose a root vertex v0 ∈ T so that
ρ(v0 , v1 ) = ρ(v0 , v2 ) ⇒ deg(v1 ) = deg(v2 ) for all v1 , v2 ∈ T.
Such a triple (T, ρ, v0 ) will be called a spherically symmetric tree (SST).
The following terminology and notation is helpful for describing SSTs. Given two distinct vertices v and w
in a tree T with a designated root v0 , we shall say that w is a descendant of v, and that v is an ancestor of
w, if v lies on the geodesic joining w and v0 . A descendant w of v for which ρ(v, w) = 1 is called a child of
v. We shall denote by d↓ (v) the number of children of the vertex v. A leaf is a vertex v of degree one.
Now suppose that (T, ρ, v0 ) is a SST. Clearly, d↓ (v) depends only on ρ(v0 , v). For 0 ≤ k < r(T ), let dk
denote the integer such that d↓ (v) = dk whenever ρ(v0 , v) = k. The downward degree sequence of T is the
finite or countable sequence
\[ d_\downarrow(T) = d_\downarrow(T, v_0) = (d_k)_{0 \le k < r(T)}. \]
Clearly one can reconstruct a SST directly from its downward degree sequence. It is worth noting however
that a countable tree can be a SST with respect to different root vertices. For example, the trees with
downward degree sequences (2, 2, 1, 2, 1, 2, 1, 2, . . . ) and (3, 1, 2, 1, 2, 1, 2, . . . ) are graph isomorphic.
Example 4.10. The SST of radius 2 and downward degree sequence (3, 2).
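Since an SST can be reconstructed from its downward degree sequence, the tree of Example 4.10 is easy to build and inspect programmatically. The sketch below is one possible construction, with hypothetical function names; vertices are numbered in breadth-first order with the root as vertex 0.

```python
from collections import deque

def build_sst(down_degrees):
    """Adjacency lists of the rooted tree with the given downward degree sequence."""
    adj = {0: []}
    level = [0]
    for dk in down_degrees:
        next_level = []
        for v in level:
            for _ in range(dk):
                w = len(adj)          # next unused vertex label
                adj[w] = [v]
                adj[v].append(w)
                next_level.append(w)
        level = next_level
    return adj

def distances_from(adj, source):
    """Path-metric distances from source, computed by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def is_spherically_symmetric(adj, root):
    """Check the condition of Definition 4.9 for the given choice of root."""
    degree_at = {}
    for v, k in distances_from(adj, root).items():
        if degree_at.setdefault(k, len(adj[v])) != len(adj[v]):
            return False
    return True

tree = build_sst((3, 2))
leaves = [v for v in tree if len(tree[v]) == 1]
print(len(tree), len(leaves), max(distances_from(tree, 0).values()))
print(is_spherically_symmetric(tree, 0), is_spherically_symmetric(tree, 1))
```

For the sequence (3, 2) this produces 10 vertices, 6 of them leaves, at radius 2 from the root. The symmetry condition holds for the root but fails if a child of the root is used as the base point instead.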
Example 4.11. Infinite combs such as this one are not SSTs.
[Figure omitted: an infinite comb, with labels 1, 2, 3, 4, . . .]
By [2, Theorem 5.4], each finite metric tree has generalized roundness strictly greater than one. Since
Definition 4.1 (c) is predicated in terms of finite collections of points from the underlying metric space, it
follows that each countable metric tree has generalized roundness at least one. In fact, an explicit (nontrivial) lower bound on the generalized roundness of a finite metric tree can always be written down. In the
particular case of a finite SST (T, ρ) it follows from Corollary 6.7 that the generalized roundness q(T ) of
(T, ρ) is at least
\[ 1 + \frac{\ln\left( 1 + \dfrac{1}{2r(T) \cdot (m - 1) \cdot \varphi(m)} \right)}{\ln 2r(T)}, \tag{4.4} \]
where m = |T | and $\varphi(m) = 1 - \tfrac{1}{2} \cdot \left( \lfloor m/2 \rfloor^{-1} + \lceil m/2 \rceil^{-1} \right)$. Bounds such as this can be derived using nothing
more complicated than the method of Lagrange multipliers.
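As a rough illustration of the size of this lower bound, the sketch below evaluates it for the SST with downward degree sequence (3, 2), which has m = 10 vertices and radius 2. It assumes that (4.4) reads 1 + ln(1 + 1/(2r(T)·(m − 1)·ϕ(m))) / ln(2r(T)) with ϕ(m) = 1 − (1/2)(1/⌊m/2⌋ + 1/⌈m/2⌉); the function names are hypothetical.

```python
import math

def phi(m):
    """phi(m) = 1 - (1/2) * (1/floor(m/2) + 1/ceil(m/2)), for m >= 2."""
    return 1 - 0.5 * (1 / (m // 2) + 1 / ((m + 1) // 2))

def sst_lower_bound(radius, m):
    """The quantity (4.4) for a finite SST with m >= 3 vertices and the given radius."""
    scale = 2 * radius
    return 1 + math.log(1 + 1 / (scale * (m - 1) * phi(m))) / math.log(scale)

print(phi(10), sst_lower_bound(2, 10))
```

The value obtained is only slightly larger than 1 (roughly 1.025 for this tree). Note that ϕ(2) = 0, so the expression is only meaningful for m ≥ 3.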
We now show how to obtain an upper bound on the generalized roundness of a finite SST (T, ρ) that
behaves well if we let m = |T | → ∞.
Theorem 4.12. Let (T, ρ) be a finite SST with radius n = r(T ) ≥ 3 and with downward degree sequence
$(d_0, d_1, \dots, d_{n-1})$. Then the generalized roundness q(T ) of (T, ρ) must satisfy
\[ q(T) < \min_{k} \frac{\ln\left( 2 + \dfrac{2}{d_0 \cdots d_k - 1} \right)}{\ln\left( 2 - \dfrac{2k}{n} \right)}, \tag{4.5} \]
where the minimum is taken over all natural numbers k such that 1 ≤ k < n/2 and $d_0 \cdots d_k > 1$. If no such
natural numbers k exist, then q(T ) ≤ 2 (with examples to show that equality is possible).
Proof. Let (T, ρ) be as in the statement of the theorem. Let v0 denote the root of T and set p = q(T ).
Consider an arbitrary natural number k such that 1 ≤ k < n/2 and d0 · · · dk > 1. There are d0 d1 · · · dk−1
vertices at distance k from v0 . For each of the dk children of such a vertex, choose a leaf which is a
descendant of that child. This results in a total of q = d0 d1 · · · dk−1 dk distinct leaves which we label
a1 , . . . , aq . By the construction, the geodesic joining any two distinct leaves in this list must pass through at
least one of the vertices at distance k from v0 . Consequently ρ(ai , aj ) ≥ 2(n − k) whenever i ≠ j. Now set
b1 = b2 = · · · = bq = v0 and consider the corresponding generalized roundness inequality (4.1). The left side of
this inequality is greater than $q(q - 1)(2(n - k))^p / 2$ and the right side is exactly $q^2 n^p$, so we see that p must
satisfy the weaker inequality $q(q - 1)(2(n - k))^p / 2 < q^2 n^p$. On taking logarithms, elementary rearrangement
shows that
\[ q(T) < \frac{\ln\left( 2 + \dfrac{2}{d_0 \cdots d_k - 1} \right)}{\ln\left( 2 - \dfrac{2k}{n} \right)}. \tag{4.6} \]
As (4.6) holds for all k such that 1 ≤ k < n/2 and $d_0 \cdots d_k > 1$, the main assertion of the theorem is evident.
In the event that no such natural numbers k exist, the second assertion of the theorem follows from the
routine observation that the SST with degree sequence (1, 1) has generalized roundness two. □
While the upper bound given in Theorem 4.12 is far from optimal, it is good enough to prove the following
result.
Theorem 4.13. Let (T, ρ) be a countable SST with degree sequence (d0 , d1 , d2 , . . .). If |{j | dj > 1}| = ℵ0 ,
then (T, ρ) has generalized roundness one.
Proof. Let (T, ρ) be a given SST as in the statement of the theorem. The condition on (d0 , d1 , d2 , . . .)
ensures that d1 d2 · · · dk → ∞ as k → ∞. Moreover, by considering the truncations of T , it is clear that
q(T ) satisfies (4.6) for all n ≥ 3 and all k < n/2. Now set k = ⌊ln n⌋ (for example) and let n → ∞. We see
immediately that q(T ) ≤ 1. But, as we noted at the outset of this section, q(T ) ≥ 1 too. Thus, q(T ) = 1. □
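The limiting argument in the proof of Theorem 4.13 can be watched numerically. The sketch below evaluates the right side of (4.6) with k = ⌊ln n⌋ for a hypothetical SST whose downward degrees are all equal to 2, so that d0 · · · dk = 2^(k+1); the function names and the constant degree are assumptions made purely for illustration.

```python
import math

def upper_bound(n, k, product_d0_to_dk):
    """Right side of (4.6): ln(2 + 2/(d0...dk - 1)) / ln(2 - 2k/n), for 1 <= k < n/2."""
    return math.log(2 + 2 / (product_d0_to_dk - 1)) / math.log(2 - 2 * k / n)

def bound_along_truncations(n, degree=2):
    """The bound for the radius-n truncation, with k = floor(ln n) and constant degree."""
    k = math.floor(math.log(n))
    return upper_bound(n, k, degree ** (k + 1))

values = [bound_along_truncations(n) for n in (10, 100, 1000, 10 ** 6)]
print(values)
```

The values decrease towards 1 as n grows, which is the mechanism behind the conclusion q(T ) ≤ 1. Any choice of k that tends to infinity more slowly than n would serve the same purpose.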
One immediate consequence of Theorem 4.13 is the following observation.
Corollary 4.14. There exists an uncountable number of mutually non-isomorphic countable SSTs of generalized roundness one.
Proof. Consider the set of degree sequences D = {(1, d1 , d2 , d3 , . . . ) : 1 < d1 < d2 < d3 < . . . }. This
set is clearly uncountable and, by Theorem 4.13, the SST corresponding to each element of D has generalized
roundness one. Suppose that $(1, d_1, d_2, \dots)$ and $(1, d'_1, d'_2, \dots)$ are distinct elements of D, and let T and T′
be the corresponding SSTs. Let k be the smallest integer such that $d_k \ne d'_k$ where, without loss, we may
assume that $d_k < d'_k$. Then T has vertices with degree $d_k + 1$, but T′ does not, and so the two graphs are
not isomorphic. □
4.2. Asymmetric Trees of Generalized Roundness One. In this subsection we consider countable
trees S, endowed with the usual path metric ρ, that have at most finitely many leaves and develop the natural
analogues of Theorems 4.12 and 4.13. This is done via the statement and proof of Theorem 4.17. The main
idea is that one can relax the hypotheses of the last section as long as the tree S ‘spreads out’ at a suitable
rate. In particular, we will completely dispense with the requirement of spherical symmetry.
Recall that any countable tree (S, ρ) that has a vertex v of degree ℵ0 trivially has generalized roundness
one and so we will exclude such trees from the subsequent discussion.
Definition 4.15. Let (S, ρ, v0 ) be a countable rooted tree. A vertex v ∈ S is said to be infinitely bifurcating
if it has infinitely many descendants with vertex degree greater than or equal to 3.
Note that every ancestor of an infinitely bifurcating vertex is clearly also infinitely bifurcating, and every
infinitely bifurcating vertex has at least one child which is infinitely bifurcating.
Lemma 4.16. Let (S, ρ, v0 ) be a countable rooted tree. Then the following are equivalent.
(1) v0 is an infinitely bifurcating vertex.
(2) There exists an infinitely bifurcating vertex v ∈ S.
(3) There exist infinitely many bifurcating vertices in S.
(4) There exist infinitely many vertices in S with degree at least 3.
(5) There exists a radial geodesic path (v0 , v1 , v2 , . . . ) which contains infinitely many vertices with degree
at least 3.
Proof. Most of the equivalences are trivial, or else easy consequences of the comments before the
lemma. To see that (1) implies (5), one may recursively construct a path by, for each j ≥ 0, choosing vj+1
to be a child of vj which is also infinitely bifurcating. If this path contained only finitely many vertices of
degree at least 3 there would be an integer J so that deg(vj ) = 2 for all j ≥ J, and this would contradict
the fact that vJ is infinitely bifurcating. □
A tree (S, ρ, v0 ) that satisfies any one (and hence all) of the conditions (1) through (5) of Lemma 4.16 will
be said to be an infinitely bifurcating tree.
Theorem 4.17. Let (S, ρ, v0 ) be a (countable) infinitely bifurcating tree with only finitely many leaves. Then
(S, ρ, v0 ) has generalized roundness one.
Proof. Let (S, ρ, v0 ) be a countable tree that satisfies the hypotheses of the theorem. Let R =
max{ρ(v0 , v) : v is a leaf}. By Lemma 4.16 (5) there is a radial geodesic path starting at v0 that contains
infinitely many vertices with degree at least 3, and we may choose w to be the vertex on this path with
ρ(v0 , w) = R + 1. The tree consisting of w and all its descendants is then a subtree of S which
has no leaves and which still has an infinite number of vertices with degree at least 3.
It suffices to assume therefore that (S, ρ, v0 ) has no leaves. The proof now proceeds by modifying the
arguments used to establish Theorems 4.12 and 4.13 together with the new structural ingredient of infinite
bifurcation.
Since S is infinitely bifurcating we can choose a radial geodesic path P = (v0 , v1 , v2 , . . . ) which contains
infinitely many vertices with degree at least 3. Starting at v1 (and preserving the original order), denote the
vertices in P with degree at least 3 by c1 , c2 , c3 , . . . .
Let n ≥ 2 be a given natural number and let m = ρ(v0 , cn ). Clearly m ≥ n. Suppose that 1 ≤ j ≤ n.
Since cj has degree at least 3, it has a child which does not lie on P . Since S has no leaves, we may choose
a descendant of this child, which we shall denote aj , with ρ(v0 , aj ) = $m^2 + m$. If we now fix bj = v0 for
1 ≤ j ≤ n, we see that ρ(ai , bj ) = $m + m^2$ for all i and j, while ρ(ai , aj ) ≥ $2m^2$ whenever i ≠ j (1 ≤ i, j ≤ n).
Let p = q(S). It follows (by a similar argument to that given in the proof of Theorem 4.12) that we
must have
\[ n(n - 1)(2m^2)^p \le 2n^2 (m + m^2)^p. \]
Thus
\[ p \le \frac{\ln\left( 2 + \dfrac{2}{n - 1} \right)}{\ln\left( 2 - \dfrac{2}{m + 1} \right)}. \tag{4.7} \]
If we now let n (and therefore m) → ∞ in (4.7) we see that p ≤ 1. □
Remark 4.18. The hypothesis that an infinitely bifurcating tree (S, ρ, v0 ) have only finitely many leaves is
clearly not necessary in order that q(S) = 1. We have already noted that if a countable tree has a vertex of
degree ℵ0 , then its generalized roundness must be one. It is also clear from the estimates in this subsection
that if a countable tree includes vertices of arbitrarily high degree, then its generalized roundness must be
one. It follows from either of these observations that a countable metric tree with a countable number of
leaves can have generalized roundness one. The following example from [2] exhibits this type of behavior
nicely. Let n ≥ 2 be a natural number. Let Yn denote the unique tree with n + 1 vertices and n leaves.
In other words, Yn consists of an internal vertex, which we will denote rn , surrounded by n leaves. We
endow Yn with the unweighted path metric ρ as per usual. The generalized roundness of (Yn , ρ) is computed
explicitly in [2, Theorem 5.6]:
\[ q(Y_n) = 1 + \frac{\ln\left( 1 + \dfrac{1}{n - 1} \right)}{\ln 2}. \]
We now form a countable tree Y as follows: For each natural number n ≥ 2 connect Yn to Yn+1 by introducing
a new edge which connects the internal node rn of Yn to the internal node rn+1 of Yn+1 . We may further
endow Y with the unweighted path metric ρ. The countable metric tree (Y, ρ) has countably many leaves and
it clearly has generalized roundness one. However, the proof of Theorem 4.17 does not apply to (Y, ρ). On
the other hand it is easy to construct a countable tree with only finitely many leaves which has generalized
roundness 2. For example, the natural numbers with their usual metric will suffice.
These comments mean that, in spite of Theorem 4.17, the exact role that the number of leaves plays
in determining the generalized roundness of an infinitely bifurcating tree is still not entirely clear. The
interesting open question concerns the generalized roundness of infinitely bifurcating trees (with infinitely
many leaves) where the degrees of the vertices are bounded.
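The formula for q(Yn ) quoted in Remark 4.18 can be related to a single generalized roundness inequality: take the a's to be the n leaves of Yn and every b to be the internal vertex. The sketch below (hypothetical function names; vertex 0 plays the role of the internal vertex) checks that this one inequality becomes an equality exactly at the quoted exponent, so it already forces q(Yn ) to be no larger than that value.

```python
import math
from itertools import combinations

def star_metric(u, v):
    """Path metric on Y_n: vertex 0 is the internal vertex, 1..n are the leaves."""
    if u == v:
        return 0
    return 1 if 0 in (u, v) else 2

def q_star(n):
    """The value 1 + ln(1 + 1/(n - 1)) / ln 2 quoted for q(Y_n)."""
    return 1 + math.log(1 + 1 / (n - 1)) / math.log(2)

def leaves_vs_centre_gap(n, p):
    """Right minus left side of (4.1) with a = the n leaves and b = n copies of the centre (p > 0)."""
    leaves = range(1, n + 1)
    left = sum(star_metric(u, v) ** p for u, v in combinations(leaves, 2))
    right = sum(star_metric(u, 0) ** p for u in leaves for _ in leaves)
    return right - left

for n in (2, 5, 50, 5000):
    print(n, q_star(n), leaves_vs_centre_gap(n, q_star(n)))
```

The printed exponents decrease towards 1 as n grows, which matches the claim that the tree Y built from all of the Yn has generalized roundness one.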
5. Enhanced and strict p-negative type
Condition (b) of Theorem 4.7 has important technical ramifications. For example, it allows the use of
methods of constrained optimization — such as Lagrange’s Multiplier Theorem — to help determine a theory
of (strict) p-negative type metrics. Therefore, to help streamline the exposition throughout the remainder
of this document, we use condition (b) of Theorem 4.7 to motivate the following technical definition.
Definition 5.1. Let s, t be arbitrary natural numbers and let X be any set.
(a) An (s, t)-simplex in X is an (s + t)-vector $(a_1, \dots, a_s, b_1, \dots, b_t) \in X^{s+t}$ consisting of s + t pairwise
distinct coordinates a1 , . . . , as , b1 , . . . , bt ∈ X. Such a simplex will be denoted by $D = [a_j; b_i]_{s,t}$.
(b) A load vector for an (s, t)-simplex $D = [a_j; b_i]_{s,t} \subseteq X$ is a vector $\omega = (m_1, \dots, m_s, n_1, \dots, n_t) \in \mathbb{R}_+^{s+t}$
that assigns a positive weight mj > 0 or ni > 0 to each vertex aj or bi of D, respectively.
(c) A loaded (s, t)-simplex in X consists of an (s, t)-simplex D = [aj ; bi ]s,t ⊆ X together with a load
vector ω = (m1 , . . . , ms , n1 , . . . , nt ) for D. Such a loaded simplex will be denoted by D(ω) or
[aj (mj ); bi (ni )]s,t as the need arises.
(d) A normalized (s, t)-simplex in X is simply a loaded (s, t)-simplex D(ω) in X whose load vector
ω = (m1 , . . . , ms , n1 , . . . , nt ) satisfies the two normalizations m1 + · · · + ms = 1 = n1 + · · · + nt . Such
a vector ω will be called a normalized load vector for D.
The following strict variant of Theorem 4.7 evidently holds.
Theorem 5.2. Let p ≥ 0 and let (X, d) be a metric space. Then the following conditions are equivalent:
(a) (X, d) has strict p-negative type.
(b) For all s, t ∈ N and all normalized (s, t)-simplices $D(\omega) = [a_j(m_j); b_i(n_i)]_{s,t}$ in X we have:
\[ \sum_{1 \le j_1 < j_2 \le s} m_{j_1} m_{j_2} d(a_{j_1}, a_{j_2})^p + \sum_{1 \le i_1 < i_2 \le t} n_{i_1} n_{i_2} d(b_{i_1}, b_{i_2})^p < \sum_{j,i=1}^{s,t} m_j n_i d(a_j, b_i)^p. \]
(c) p is a strict generalized roundness exponent for (X, d).
Proof. The argument proceeds in exactly the same way as the proof of Theorem 4.7. The only adjustment
is to replace ≥ with > in (4.3). □
The characterization of strict p-negative type given by condition (b) of Theorem 5.2 suggests a new
theoretical approach to considerations of (strict) p-negative type. The main idea is encapsulated in the two
following definitions.
Definition 5.3. Let p ≥ 0 and (X, d) be a metric space. Let s, t be natural numbers and $D = [a_j; b_i]_{s,t}$ be an
(s, t)-simplex in X. Denote by $N_{s,t}$ the set of all normalized load vectors $\omega = (m_1, \dots, m_s, n_1, \dots, n_t) \in \mathbb{R}_+^{s+t}$
for D. Then the (normalized) p-negative type simplex gap of D is defined to be the function $\gamma_D^p : N_{s,t} \to \mathbb{R}$
where
\[ \gamma_D^p(\omega) = \sum_{j,i=1}^{s,t} m_j n_i d(a_j, b_i)^p - \sum_{1 \le j_1 < j_2 \le s} m_{j_1} m_{j_2} d(a_{j_1}, a_{j_2})^p - \sum_{1 \le i_1 < i_2 \le t} n_{i_1} n_{i_2} d(b_{i_1}, b_{i_2})^p \]
for each $\omega = (m_1, \dots, m_s, n_1, \dots, n_t) \in N_{s,t}$.
Notice that $\gamma_D^p(\omega)$ is just the difference between the right-hand side and the left-hand side of the
inequality (4.2). So, by Theorem 5.2, (X, d) has strict p-negative type if and only if $\gamma_D^p(\omega) > 0$ for each
normalized (s, t)-simplex D(ω) in X.
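The simplex gap is a finite sum and is straightforward to evaluate for a given normalized simplex. The following sketch uses exact rational arithmetic; the function name `simplex_gap`, the three-point subset {0, 1, 2} of the real line and the chosen load vector are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations

def simplex_gap(d, a, m, b, n, p):
    """The p-negative type simplex gap of the loaded simplex [a_j(m_j); b_i(n_i)]."""
    cross = sum(mj * ni * d(aj, bi) ** p
                for aj, mj in zip(a, m) for bi, ni in zip(b, n))
    within_a = sum(m[j1] * m[j2] * d(a[j1], a[j2]) ** p
                   for j1, j2 in combinations(range(len(a)), 2))
    within_b = sum(n[i1] * n[i2] * d(b[i1], b[i2]) ** p
                   for i1, i2 in combinations(range(len(b)), 2))
    return cross - within_a - within_b

def line(u, v):
    return abs(u - v)

# A normalized (2, 1)-simplex: the endpoints 0 and 2 carry weight 1/2 each,
# and the midpoint 1 carries weight 1.
a, m = (0, 2), (Fraction(1, 2), Fraction(1, 2))
b, n = (1,), (Fraction(1),)

print(simplex_gap(line, a, m, b, n, 1), simplex_gap(line, a, m, b, n, 2))
```

For this simplex the gap is 1/2 at p = 1 and exactly 0 at p = 2. So although the real line has 2-negative type, the strict inequality of Theorem 5.2 (b) fails for it at p = 2.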
Definition 5.4. Let p ≥ 0. Let (X, d) be a metric space with p-negative type. We define the (normalized)
p-negative type gap of (X, d) to be the non-negative quantity
\[ \Gamma_X^p = \inf_{D(\omega)} \gamma_D^p(\omega) \]
where the infimum is taken over all normalized (s, t)-simplices D(ω) in X.
Remark 5.5. Suppose (X, d) is a metric space with p-negative type for some p ≥ 0. There are two ways in
which we may view the parameter $\Gamma = \Gamma_X^p$. By definition, Γ is the largest non-negative constant so that
\[ \Gamma + \sum_{1 \le j_1 < j_2 \le s} m_{j_1} m_{j_2} d(a_{j_1}, a_{j_2})^p + \sum_{1 \le i_1 < i_2 \le t} n_{i_1} n_{i_2} d(b_{i_1}, b_{i_2})^p \le \sum_{j,i=1}^{s,t} m_j n_i d(a_j, b_i)^p \tag{5.1} \]
for all normalized (s, t)-simplices $D(\omega) = [a_j(m_j); b_i(n_i)]_{s,t}$ in X. Alternatively, Γ is the largest non-negative
constant so that
\[ \frac{\Gamma}{2} \left( \sum_{\ell=1}^{n} |\eta_\ell| \right)^2 + \sum_{1 \le i, j \le n} d(x_i, x_j)^p \eta_i \eta_j \le 0 \tag{5.2} \]
for all natural numbers n ≥ 2, all finite subsets {x1 , . . . , xn } ⊆ X, and all choices of real numbers η1 , . . . , ηn
with η1 + · · · + ηn = 0. The fact that Γ is scaled on the left-hand side of (5.2) simply reflects that the classical
p-negative type inequalities are not (by definition) normalized whereas the generalized roundness inequalities
are normalized. The second assertion (5.2) is less obvious and so we will prove it here as a theorem. The
proof proceeds similarly to that of Theorem 4.7.
Theorem 5.6. Let (X, d) be a metric space with p-negative type for some p ≥ 0. Let $\Gamma = \Gamma_X^p$ denote the
p-negative type gap of (X, d). Then
\[
\frac{\Gamma}{2} \Bigl( \sum_{\ell=1}^{n} |\eta_\ell| \Bigr)^{2} + \sum_{1 \le i,j \le n} d(x_i, x_j)^p\, \eta_i \eta_j \le 0
\]
for all natural numbers n ≥ 2, all finite subsets $\{x_1, \ldots, x_n\} \subseteq X$, and all choices of real numbers
$\eta_1, \ldots, \eta_n$ with $\eta_1 + \cdots + \eta_n = 0$.
Proof. Let n ≥ 2. Consider an arbitrary subset $\{x_1, \ldots, x_n\} \subseteq X$ together with real numbers
$\eta_1, \ldots, \eta_n$ such that $\eta_1 + \cdots + \eta_n = 0$. To avoid a triviality we assume that
$(\eta_1, \ldots, \eta_n) \ne (0, \ldots, 0)$. Points with $\eta_i = 0$ contribute nothing to either sum in the stated
inequality, so we may discard them and assume that every $\eta_i$ is non-zero. By relabeling (if necessary) we may then
assume there exist natural numbers s, t ∈ N such that s + t = n, $\eta_1, \ldots, \eta_s > 0$ and
$\eta_{s+1}, \ldots, \eta_n < 0$. Now make the following designations:
(i) As $\sum_{j=1}^{s} \eta_j = -\sum_{k=s+1}^{n} \eta_k$ we may define
\[
\alpha = \sum_{j=1}^{s} \eta_j = -\sum_{k=s+1}^{n} \eta_k = \frac{1}{2} \sum_{\ell=1}^{n} |\eta_\ell| > 0.
\]
(ii) For 1 ≤ j ≤ s, set $a_j = x_j$ and $m_j = \eta_j / \alpha$. And for 1 ≤ i ≤ t, set $b_i = x_{n-i+1}$ and
$n_i = -\eta_{n-i+1} / \alpha$.
By construction, $a_1, \ldots, a_s, b_1, \ldots, b_t$ are distinct points in X, $m_1, \ldots, m_s, n_1, \ldots, n_t > 0$ and
$m_1 + \cdots + m_s = 1 = n_1 + \cdots + n_t$. Thus $D = [a_j(m_j); b_i(n_i)]_{s,t}$ is a normalized (s, t)-simplex in X.
Now observe that
\[
\frac{1}{\alpha^2} \sum_{1 \le i,j \le n} d(x_i, x_j)^p\, \eta_i \eta_j
 = 2 \Bigl( \sum_{1 \le j_1 < j_2 \le s} m_{j_1} m_{j_2}\, d(a_{j_1}, a_{j_2})^p
          + \sum_{1 \le i_1 < i_2 \le t} n_{i_1} n_{i_2}\, d(b_{i_1}, b_{i_2})^p
          - \sum_{j,i=1}^{s,t} m_j n_i\, d(a_j, b_i)^p \Bigr)
 \le -2\Gamma
\]
by (5.1). Hence $2\Gamma\alpha^2 + \sum_{1 \le i,j \le n} d(x_i, x_j)^p\, \eta_i \eta_j \le 0$, which is the desired outcome
by definition of α. □
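The change of variables in the proof above (from a zero-sum vector η to a normalized simplex) can be checked numerically. The sketch below is an illustration under stated assumptions: the helper names `eta_to_simplex`, `quadratic_form` and `simplex_gap` are hypothetical, the example lives on the real line with p = 1, and points with zero weight are simply dropped.

```python
def eta_to_simplex(points, eta):
    """Split a zero-sum weight vector into the two weighted sides of a simplex."""
    alpha = sum(abs(h) for h in eta) / 2
    a = [(x, h / alpha) for x, h in zip(points, eta) if h > 0]
    b = [(x, -h / alpha) for x, h in zip(points, eta) if h < 0]
    return alpha, a, b

def quadratic_form(points, eta, dist, p):
    """Sum over all ordered pairs (i, j) of d(x_i, x_j)^p * eta_i * eta_j (p > 0)."""
    return sum(dist(x, y) ** p * hx * hy
               for x, hx in zip(points, eta)
               for y, hy in zip(points, eta))

def simplex_gap(a, b, dist, p):
    """Cross term minus the two within-side terms, for weighted point lists a, b."""
    cross = sum(mj * ni * dist(x, y) ** p for x, mj in a for y, ni in b)
    within = sum(side[u][1] * side[v][1] * dist(side[u][0], side[v][0]) ** p
                 for side in (a, b)
                 for u in range(len(side))
                 for v in range(u + 1, len(side)))
    return cross - within

def line_metric(x, y):
    return abs(x - y)

points = [0.0, 1.0, 2.0, 3.0]
eta = [1.0, -1.0, -1.0, 1.0]          # sums to zero
alpha, a, b = eta_to_simplex(points, eta)

# The identity used in the proof: (1/alpha^2) * Q(eta) = -2 * gamma_D(omega).
lhs = quadratic_form(points, eta, line_metric, 1) / alpha ** 2
rhs = -2 * simplex_gap(a, b, line_metric, 1)
print(lhs, rhs)  # -1.0 -1.0
```

The two printed numbers agree, which is exactly the equality on which the bound by −2Γ is then imposed.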
Even for a relatively simple metric space (X, d) (such as a finite metric tree) computing the p-negative type
gap $\Gamma_X^p$ for suitable values of p can be a daunting non-linear task. The following subsection gives some idea
of the complexities that may arise but it also gives an idea why working with normalized (s, t)-simplices can
be fruitful.
5.1. Determining the 1-negative type simplex gap of a finite metric tree. Hjorth et al. [8]
have shown that finite metric trees have strict 1-negative type. The purpose of this subsection is to recover
this result by developing a formula for the 1-negative type simplex gap $\gamma_D^1$ of an arbitrary (s, t)-simplex D in
a finite metric tree (T, d). This is done in Theorem 5.16. Before doing so, however, it is helpful
to review some basic facts and standard notation pertaining to finite metric trees. We will also introduce
some concepts and notation that are less standard.
Definition 5.7. A finite metric tree is a finite connected graph T that has no cycles, endowed with an edge
weighted path metric d. Terminal vertices in T are called leaves or pendants. Given vertices x, y ∈ T , the
unique shortest path from x to y is called a geodesic and is denoted [x, y]. In particular, the pair e = (x, y)
is an edge in T if and only if the geodesic [x, y] from x to y contains no other vertices of T . If an edge e lies
on a geodesic [x, y], we may sometimes write e ⊆ [x, y].
Notation 5.8. Given an edge e = (x, y) in a finite metric tree (T, d) we will often find it convenient to use the
notation |e| = d(x, y) to denote the metric length of the edge.
Definition 5.9. Let (T, d) be a finite metric tree.
(a) If |e| = 1 for all edges e = (x, y) in (T, d) we will say that the path metric d is ordinary or
unweighted.⁷
(b) More generally, if |e| ≠ 1 for at least one edge e = (x, y) in (T, d), we will say that the path metric
d is edge weighted.
Problem 5.10. Prove that every finite metric tree has (maximal) roundness 2.
Definition 5.11. Given a finite metric tree (T, d) and a set of vertices V ⊆ T we can form the smallest
subtree of T that contains all the vertices of V (denoted by $T_V$) and we can endow it with the natural
restriction of the metric d. We will call $(T_V, d)$ the minimal subtree of (T, d) generated by the set of vertices
V. Clearly: if $V = \{v_1, \ldots, v_k\} \subseteq T$ then the minimal subtree $T_V$ consists of all vertices x ∈ T that lie on
some geodesic $[v_i, v_j]$ in T. Of course, the minimal subtree $(T_V, d)$ is a finite metric tree in its own right.
Given a subset V ⊆ T it is also clear that $T_V = T$ if and only if V contains all the leaves of T.
The following definition introduces a convention to “orient” the edges in any given tree. This will enable
the treatment of edges as ordered pairs in a systematic and unambiguous way. Orientation will play a key
rôle in determining the main result of this subsection (Corollary 5.17).
Definition 5.12. Let (T, d) be a finite metric tree. By way of convention, we choose and then highlight a
fixed leaf ℓ ∈ T. This distinguished leaf ℓ is then called the root of T. Once the root has been fixed we may
make the following definitions.
(a) An edge e = (x, y) in T is (left/right) oriented if d(x, ℓ) > d(y, ℓ). In other words, an oriented edge
in T is an ordered pair e = (x, y) of adjacent vertices x, y ∈ T where x is geodesically further from
the root ℓ than y. The set of all such oriented edges e in T will be denoted E(T).
(b) A vertex v ∈ T is to the left of an oriented edge e = (x, y) ∈ E(T) if d(v, x) < d(v, y). If it is also
the case that v ≠ x then we will say that v is strictly to the left of e. The set of all vertices v ∈ T
that are to the left of e will be denoted L(e). And the set of all vertices v ∈ T that are strictly
to the left of e will be denoted $\overline{L}(e)$. Notice that we always have x ∈ L(e) but it can happen that
$\overline{L}(e) = \emptyset$. We may think of L(e) as the vertices of the subtree that is rooted at x and oriented as
per T.
(c) A vertex v ∈ T is to the right of an oriented edge e = (x, y) ∈ E(T) if d(v, y) < d(v, x). If it is also
the case that v ≠ y then we will say that v is strictly to the right of e. The set of all vertices v ∈ T
that are to the right of e will be denoted R(e). And the set of all vertices v ∈ T that are strictly
to the right of e will be denoted $\overline{R}(e)$.
⁷ In this case, d is usually denoted by ρ, as in the earlier sections of these notes. See Example 1.3.
Notice that each oriented edge e ∈ E(T ) partitions the vertices of T into a disjoint union L(e) ∪ R(e).
Henceforth, whenever we are referring to a particular finite metric tree, it will be understood that a root
leaf has been chosen from the outset. So “edges” are now always ordered pairs e = (x, y) with the left vertex
x as the first coordinate and the right vertex y as the second coordinate. In particular, orientation affords
the following compact notation.
Notation 5.13. Given an oriented edge e = (x, y) in a finite metric tree (T, d) we may use its unique left
vertex x to alternately denote the edge as e(x). Note that, under this scheme, e(ℓ) is not defined because
the root leaf ℓ is not the left vertex of any oriented edge. All other vertices in T appear (uniquely) as the
left vertex of some oriented edge.
Definition 5.14. Let $D = [a_j; b_i]_{s,t}$ be a fixed (s, t)-simplex in a finite metric tree (T, d). Let $T_D$ be the
minimal subtree of T generated by the vertices $a_j, b_i$ of D. Orient the edges of $T_D$ by fixing a root leaf
$\ell \in T_D$. For each oriented edge $e \in E(T_D)$ and each load vector
$\omega = (m_1, \ldots, m_s, n_1, \ldots, n_t) \in \mathbb{R}^{s+t}_{+}$ for D,
we define the following partition sums of ω:
(a) $\alpha_L(\omega, e) = \sum_{j \in A_L(e)} m_j$ where $A_L(e) = \{ j \in [s] : a_j \in L(e) \}$.
(b) $\alpha_R(\omega, e) = \sum_{j \in A_R(e)} m_j$ where $A_R(e) = \{ j \in [s] : a_j \in R(e) \}$.
(c) $\beta_L(\omega, e) = \sum_{i \in B_L(e)} n_i$ where $B_L(e) = \{ i \in [t] : b_i \in L(e) \}$.
(d) $\beta_R(\omega, e) = \sum_{i \in B_R(e)} n_i$ where $B_R(e) = \{ i \in [t] : b_i \in R(e) \}$.
If, in the above definitions, we replace L(e) and R(e) with $\overline{L}(e)$ and $\overline{R}(e)$ (respectively), then we obtain
the strict partition sums of ω: $\overline{\alpha}_L(\omega, e)$, $\overline{\alpha}_R(\omega, e)$, $\overline{\beta}_L(\omega, e)$ and
$\overline{\beta}_R(\omega, e)$. For example:
(e) $\overline{\alpha}_L(\omega, e) = \sum \{ m_j : a_j \in \overline{L}(e) \}$.
(f) $\overline{\beta}_L(\omega, e) = \sum \{ n_i : b_i \in \overline{L}(e) \}$.
Notice that if the load vector ω is normalized, then we obtain the innocuous looking (but important)
identities $\alpha_L(\omega, e) + \alpha_R(\omega, e) = 1 = \beta_L(\omega, e) + \beta_R(\omega, e)$.
Notation 5.15. In relation to Definition 5.14, if we want to emphasize the (fixed) underlying (s, t)-simplex
D, we may sometimes write $\alpha_L(D, \omega, e)$ in place of $\alpha_L(\omega, e)$, and so on. While this notation may seem
cumbersome it actually allows the efficient statement and proof of some important theorems such as the
following.
Theorem 5.16. Let $D = [a_j; b_i]_{s,t}$ be a given (s, t)-simplex in a finite metric tree (T, d). Let $T_D$ denote the
minimal subtree of T generated by the vertices of D. Let $N_{s,t} \subset \mathbb{R}^{s+t}_{+}$ denote the set of all normalized
load vectors for D. Let $\gamma_D$ denote the 1-negative type simplex gap $\gamma_D^1 : N_{s,t} \to \mathbb{R}$. Then, for each such
normalized load vector $\omega = (m_1, \ldots, m_s, n_1, \ldots, n_t) \in N_{s,t}$, the evaluation $\gamma_D(\omega)$ is given by the
following formulas:
\[
\gamma_D(\omega) = \sum_{e \in E(T_D)} \bigl( \alpha_L(\omega, e) - \beta_L(\omega, e) \bigr)^2 \cdot |e|
               = \sum_{e \in E(T_D)} \bigl( \alpha_R(\omega, e) - \beta_R(\omega, e) \bigr)^2 \cdot |e|.
\]
In particular it follows that the simplex gap functions $\gamma_D : N_{s,t} \to \mathbb{R}$ are positive valued for all possible
(s, t)-simplexes D ⊆ T.
Proof. Fix a normalized load vector $\omega = (m_1, \ldots, m_s, n_1, \ldots, n_t)$ for the given (s, t)-simplex
$D = [a_j; b_i]_{s,t}$. The idea of the proof is to calculate the contribution of each oriented edge $e \in E(T_D)$ to the
simplex gap evaluation $\gamma_D(\omega)$, and then to sum over all such oriented edges.
As per Definition 5.3, $\gamma_D(\omega) = R_D(\omega) - L_D(\omega)$, where
\[
L_D(\omega) = \sum_{1 \le j_1 < j_2 \le s} m_{j_1} m_{j_2}\, d(a_{j_1}, a_{j_2})
            + \sum_{1 \le i_1 < i_2 \le t} n_{i_1} n_{i_2}\, d(b_{i_1}, b_{i_2}),
\quad \text{and} \quad
R_D(\omega) = \sum_{j,i=1}^{s,t} m_j n_i\, d(a_j, b_i).
\]
Notice that if [x, y] is a geodesic in the minimal subtree $T_D$, then:
\[
d(x, y) = \sum \bigl\{ |f| : f \in E(T_D) \text{ and } f \subseteq [x, y] \bigr\}.
\tag{5.3}
\]
This is because $(T_D, d)$ is a metric tree. Due to the geodesic decompositions (5.3) we may therefore rewrite
the sums $L_D(\omega)$ and $R_D(\omega)$ as
\[
L_D(\omega) = \sum_{e \in E(T_D)} L_D^{(e)}(\omega) \cdot |e|,
\quad \text{and} \quad
R_D(\omega) = \sum_{e \in E(T_D)} R_D^{(e)}(\omega) \cdot |e|,
\]
where the coefficients $L_D^{(e)}(\omega)$ and $R_D^{(e)}(\omega)$ are yet to be determined.
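Now consider a fixed oriented edge $e \in E(T_D)$. Notice that if the edge e lies on the geodesic $[a_{j_1}, a_{j_2}]$
then the term $m_{j_1} m_{j_2} \cdot |e|$ appears in the sum $L_D(\omega)$ (and so on). For this to happen, $a_{j_1}$ must be to the
left of e (that is, $j_1 \in A_L(e)$) and $a_{j_2}$ must be to the right of e (that is, $j_2 \in A_R(e)$), or vice versa. This and
similar such comments, together with the definitions of $L_D(\omega)$ and $R_D(\omega)$, imply:
\[
\begin{aligned}
L_D^{(e)}(\omega)
 &= \Bigl( \sum_{j_1 \in A_L(e)} m_{j_1} \Bigr) \Bigl( \sum_{j_2 \in A_R(e)} m_{j_2} \Bigr)
  + \Bigl( \sum_{i_1 \in B_L(e)} n_{i_1} \Bigr) \Bigl( \sum_{i_2 \in B_R(e)} n_{i_2} \Bigr) \\
 &= \alpha_L(\omega, e) \cdot \alpha_R(\omega, e) + \beta_L(\omega, e) \cdot \beta_R(\omega, e) \\
 &= \alpha_L(\omega, e) \cdot (1 - \alpha_L(\omega, e)) + \beta_L(\omega, e) \cdot (1 - \beta_L(\omega, e)), \quad \text{and} \\
R_D^{(e)}(\omega)
 &= \Bigl( \sum_{j \in A_L(e)} m_j \Bigr) \Bigl( \sum_{i \in B_R(e)} n_i \Bigr)
  + \Bigl( \sum_{j \in A_R(e)} m_j \Bigr) \Bigl( \sum_{i \in B_L(e)} n_i \Bigr) \\
 &= \alpha_L(\omega, e) \cdot \beta_R(\omega, e) + \alpha_R(\omega, e) \cdot \beta_L(\omega, e) \\
 &= \alpha_L(\omega, e) \cdot (1 - \beta_L(\omega, e)) + (1 - \alpha_L(\omega, e)) \cdot \beta_L(\omega, e).
\end{aligned}
\]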
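We can now define $\gamma_D^{(e)}(\omega)$, the contribution of the oriented edge $e \in E(T_D)$ to the simplex gap evaluation
$\gamma_D(\omega)$, in a natural and obvious way:
\[
\gamma_D^{(e)}(\omega) = \bigl( R_D^{(e)}(\omega) - L_D^{(e)}(\omega) \bigr) \cdot |e|.
\]
As a result we get the following simplex gap decomposition automatically:
\[
\gamma_D(\omega) = R_D(\omega) - L_D(\omega) = \sum_{e \in E(T_D)} \gamma_D^{(e)}(\omega).
\]
Setting $\alpha = \alpha_L(\omega, e)$ and $\beta = \beta_L(\omega, e)$ we see, from the preceding computations, that:
\[
\begin{aligned}
\gamma_D^{(e)}(\omega)
 &= \bigl( R_D^{(e)}(\omega) - L_D^{(e)}(\omega) \bigr) \cdot |e| \\
 &= \bigl( \alpha \cdot (1 - \beta) + (1 - \alpha) \cdot \beta - \alpha \cdot (1 - \alpha) - \beta \cdot (1 - \beta) \bigr) \cdot |e| \\
 &= (\alpha^2 - 2\alpha\beta + \beta^2) \cdot |e| \\
 &= (\alpha - \beta)^2 \cdot |e| \\
 &= \bigl( \alpha_L(\omega, e) - \beta_L(\omega, e) \bigr)^2 \cdot |e| \\
 &= \bigl( \alpha_R(\omega, e) - \beta_R(\omega, e) \bigr)^2 \cdot |e|,
\end{aligned}
\]
where the last equality holds because $\alpha_R(\omega, e) = 1 - \alpha_L(\omega, e)$ and $\beta_R(\omega, e) = 1 - \beta_L(\omega, e)$.
Now sum $\gamma_D^{(e)}(\omega)$ over all $e \in E(T_D)$ to get the stated formulas for $\gamma_D(\omega)$.
If either vertex of an oriented edge e is a leaf in the minimal subtree $T_D$, then clearly $\gamma_D^{(e)}(\omega) > 0$.
Indeed, a leaf of $T_D$ is necessarily one of the points $a_j$ or $b_i$, so on the leaf's side of e exactly one of the two
partition sums is non-zero. As $T_D$ contains such an edge and every $\gamma_D^{(e)}(\omega)$ is non-negative, the simplex gap
satisfies $\gamma_D(\omega) > 0$, establishing the final statement of the theorem. □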
Corollary 5.17. Every finite metric tree has strict 1-negative type.
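The edge formula of Theorem 5.16 can be compared with the defining expression of Definition 5.3 on a small example. The sketch below is only an illustration: the five-vertex tree, its edge lengths, the vertex names and the weights are all hypothetical choices, and the tree is stored as a child-to-parent map with the root leaf `"r"` playing the part of the distinguished leaf.

```python
from itertools import combinations

# Hypothetical rooted tree: child -> (parent, edge length). The root "r" is a leaf.
parent = {"u": ("r", 2.0), "a1": ("u", 1.0), "b1": ("u", 3.0), "a2": ("b1", 0.5)}

def path_to_root(v):
    """Map each vertex on the geodesic from v to the root to its distance from v."""
    out, total = {v: 0.0}, 0.0
    while v in parent:
        v, length = parent[v]
        total += length
        out[v] = total
    return out

def dist(x, y):
    """Path metric: the geodesic turns around at the nearest common ancestor."""
    px, py = path_to_root(x), path_to_root(y)
    return min(px[v] + py[v] for v in px if v in py)

def direct_gap(a, b):
    """gamma_D(omega) for p = 1, computed straight from the definition."""
    cross = sum(m * n * dist(x, y) for x, m in a for y, n in b)
    within = sum(w1 * w2 * dist(x, y)
                 for side in (a, b)
                 for (x, w1), (y, w2) in combinations(side, 2))
    return cross - within

def edge_formula(a, b):
    """Sum of (alpha_L - beta_L)^2 * |e| over the oriented edges e(x)."""
    total = 0.0
    for x, (_, length) in parent.items():
        # v is to the left of e(x) exactly when x lies on the geodesic from v to the root.
        alpha_L = sum(m for v, m in a if x in path_to_root(v))
        beta_L = sum(n for v, n in b if x in path_to_root(v))
        total += (alpha_L - beta_L) ** 2 * length
    return total

# The simplex uses every leaf of the tree, so the minimal subtree is the whole tree.
a = [("a1", 0.5), ("a2", 0.5)]
b = [("b1", 0.25), ("r", 0.75)]
print(direct_gap(a, b), edge_formula(a, b))  # 1.6875 1.6875
```

Storing the tree as parent pointers makes the orientation convention explicit: each non-root vertex x is the left vertex of exactly one oriented edge e(x), as in Notation 5.13.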
In addition to finite metric trees, Hjorth et al. [7] and Hjorth et al. [8] have elaborated and studied several
other classes of finite metric spaces which have strict 1-negative type. These include — under appropriate
restrictions — finite metric spaces whose elements have been chosen from a Riemannian manifold (and
endowed with the natural inherited distances).
Using Theorem 5.16 as a springboard it is possible to compute the 1-negative type gap of any finite
metric tree exactly. (We refrain from proving the ensuing formula in these notes but comment that it can
be derived by using Lagrange multipliers and Theorem 5.16 or by using a matrix theory approach. Both
approaches are interesting in their own right.)
Theorem 5.18 (Doust and Weston [2]). Let (T, d) be a finite metric tree. Let
$\Gamma_T = \inf_{D(\omega)} \gamma_D(\omega)$ denote the 1-negative type gap of (T, d). Then:
\[
\Gamma_T = \Bigl( \sum_{e \in E(T)} |e|^{-1} \Bigr)^{-1}.
\]
Notice that the constant ΓT in Theorem 5.18 is independent of the internal geometry of the tree T and
depends only upon the unordered distribution of the tree’s edge weights. By way of analogy; the situation
we are encountering in Theorem 5.18 is to be compared to having a box of matches of unequal lengths. No
matter how we construct a metric tree T by using all of the matches in the box, we invariably get the same
value for the 1-negative type gap ΓT .
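As a small illustration of the "box of matches" remark, the formula of Theorem 5.18 can be evaluated from the list of edge lengths alone. This is a sketch, not part of the notes; the function name `tree_gap` and the sample lengths are hypothetical.

```python
def tree_gap(edge_lengths):
    """1-negative type gap of a finite metric tree, using the formula of Theorem 5.18.

    Only the multiset of edge lengths enters; the shape of the tree does not.
    """
    return 1.0 / sum(1.0 / length for length in edge_lengths)

# A tree with four unit edges (five vertices), whatever its shape.
print(tree_gap([1.0, 1.0, 1.0, 1.0]))  # 0.25

# Reordering the "matches" leaves the value unchanged.
print(tree_gap([1.0, 2.0, 4.0]) == tree_gap([4.0, 1.0, 2.0]))  # True
```

For a tree on n vertices with the ordinary path metric this gives 1/(n − 1), since such a tree has n − 1 unit edges.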
6. A quantitative lower bound on the generalized roundness of a finite metric space
The purpose of this section is to make clear the practicality of knowing the p-negative type gap $\Gamma_X^p$ of a finite
metric space (X, d). The results of this section are taken from [11].
The observation is made in [2, Theorem 5.2] that if the p-negative type gap $\Gamma_X^p$ of a finite metric space
(X, d) is positive for some p ≥ 0, then (X, d) must have strict q-negative type on some interval of the form
[p, p + ζ) where ζ > 0. However, the authors only provide an explicit value for ζ in the case p = 1. Letting
n = |X|, the value of ζ given in this case is $O(1/n^2)$. (See [2, Theorem 5.1].) The purpose of the present
section is to give a precise quantitative version of [2, Theorem 5.2] which yields significantly improved values
of ζ for all p ≥ 0. In fact, for each p ≥ 0, our value of ζ is O(1). The precise statement of this result is given
in Theorem 6.6. As an application we obtain significantly improved lower bounds on the maximal p-negative
type of finite metric trees. These are stated in Corollary 6.7. Then in Remark 6.8 we point out that the
estimates given in Corollary 6.7 are actually close to best possible for finite metric trees that resemble stars.
This suggests there is little room for improvement in the statement of Theorem 6.6, the main result of this
section.
The proof of Theorem 6.6 is facilitated by the following two technical lemmas which are easily realized
using basic calculus or by simple combinatorial arguments.
Lemma 6.1. Let s ∈ N. If s real variables $\ell_1, \ldots, \ell_s > 0$ are subject to the constraint
$\ell_1 + \cdots + \ell_s = 1$, then the expression
\[
\sum_{k_1 < k_2} \ell_{k_1} \ell_{k_2}
\]
has maximum value $\frac{s(s-1)}{2} \cdot \frac{1}{s^2} = \frac{1}{2}\bigl(1 - \frac{1}{s}\bigr)$, which is attained when
$\ell_1 = \cdots = \ell_s = \frac{1}{s}$.
Problem 6.2. Prove Lemma 6.1.
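Lemma 6.3. Let s, t ∈ N and let m = s + t. Then
\[
\frac{1}{2}\Bigl(1 - \frac{1}{s}\Bigr) + \frac{1}{2}\Bigl(1 - \frac{1}{t}\Bigr)
 \le 1 - \frac{1}{2}\Bigl( \frac{1}{\lfloor m/2 \rfloor} + \frac{1}{\lceil m/2 \rceil} \Bigr).
\]
Moreover, the function
$\phi(m) = 1 - \frac{1}{2}\bigl( \frac{1}{\lfloor m/2 \rfloor} + \frac{1}{\lceil m/2 \rceil} \bigr)$
increases strictly as m increases.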
Problem 6.4. Prove Lemma 6.3.
We will continue to use the notation
$\phi(m) = 1 - \frac{1}{2} \cdot \bigl( \lfloor m/2 \rfloor^{-1} + \lceil m/2 \rceil^{-1} \bigr)$
introduced in the preceding lemma
throughout the remainder of this section as it allows the efficient statement and succinct proof of certain key
formulas such as Theorem 6.6.
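The function ϕ and the inequality of Lemma 6.3 are easy to test on a finite range with exact rational arithmetic. The sketch below is a sanity check only (it is not a proof and does not replace Problem 6.4); the names `phi` and `lemma_lhs` and the tested ranges are arbitrary choices.

```python
from fractions import Fraction

def phi(m):
    """phi(m) = 1 - (1/2) * (1/floor(m/2) + 1/ceil(m/2)), for integers m >= 2."""
    lo, hi = m // 2, (m + 1) // 2      # floor(m/2) and ceil(m/2)
    return 1 - Fraction(1, 2) * (Fraction(1, lo) + Fraction(1, hi))

def lemma_lhs(s, t):
    """Left-hand side of the inequality in Lemma 6.3."""
    half = Fraction(1, 2)
    return half * (1 - Fraction(1, s)) + half * (1 - Fraction(1, t))

# Check the inequality and the strict monotonicity on a finite range.
lemma_holds = all(lemma_lhs(s, t) <= phi(s + t)
                  for s in range(1, 40) for t in range(1, 40))
strictly_increasing = all(phi(m) < phi(m + 1) for m in range(2, 80))

print(phi(2), phi(3), phi(4))  # 0 1/4 1/2
print(lemma_holds, strictly_increasing)  # True True
```

Using `Fraction` avoids any floating-point doubt about the comparisons; in particular ϕ(n) = (n − 2)/n for even n, which is the form used later in Remark 6.8.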
The following basic notions are also relevant to the proof of Theorem 6.6.
Definition 6.5. Let (X, d) be a metric space.
(1) The metric diameter of (X, d) is given by the quantity diam X = sup{d(x, y) | x, y ∈ X}.
(2) Provided |X| < ∞, the scaled metric diameter of (X, d) is given by the ratio
$D_X = (\operatorname{diam} X) / \min\{ d(x, y) \mid x \ne y \}$.
(3) If d(x, y) = 1 for all x ≠ y, then d is called the discrete metric on X.
The idea of the following theorem is that if $\Gamma_X^p > 0$ in the case of a finite metric space (X, d), then the left
side of each generalized roundness inequality (4.1) is bounded away from the right side by at least $\Gamma_X^p$. We
therefore expect that the exponent p in each inequality (4.1) may be increased (slightly) to some value q > p
in this setting without violating the inequality. It is a simple but powerful idea as demonstrated in the proof
of the following theorem.
Theorem 6.6. Let (X, d) be a finite metric space with cardinality n = |X| ≥ 3 and let p ≥ 0. If the
p-negative type gap $\Gamma_X^p$ of (X, d) is positive, then (X, d) has q-negative type for all q ∈ [p, p + ζ] where
\[
\zeta = \frac{ \ln \Bigl( 1 + \dfrac{\Gamma_X^p}{(\operatorname{diam} X)^p \cdot \phi(n)} \Bigr) }{ \ln D_X }.
\]
Moreover, (X, d) has strict q-negative type for all q ∈ [p, p + ζ). In particular, p + ζ provides a lower bound
on the supremal (strict) q-negative type of (X, d).
Proof. For notational ease we set $\Gamma = \Gamma_X^p$ and $D = D_X$ throughout this proof. We may assume that
the metric d is not a positive multiple of the discrete metric on X. Otherwise, (X, d) would have strict
q-negative type for all q ≥ 0. Hence D > 1.
Since scaling the metric by a positive constant has no effect on whether the space has p-negative type,
we may assume that min{d(x, y) | x ≠ y} = 1. This means that D is now the diameter of our rescaled metric
space (which we will continue to denote by (X, d)). Moreover, for all ℓ = d(x, y) ≠ 0 and all ζ > 0, we have
$\ell^{p+\zeta} - \ell^p \le D^{p+\zeta} - D^p$. This is because, for any fixed ζ > 0, the function
$f(x) = x^{p+\zeta} - x^p$ is increasing on
the interval [1, ∞). This inequality will be used in the derivation of (6.3) below.
Consider an arbitrary normalized (s, t)-simplex $[a_j(m_j); b_i(n_i)]_{s,t}$ in X. Necessarily, m = s + t ≤ n.
For any given r ≥ 0, let
\[
L(r) = \sum_{j_1 < j_2} m_{j_1} m_{j_2}\, d(a_{j_1}, a_{j_2})^r + \sum_{i_1 < i_2} n_{i_1} n_{i_2}\, d(b_{i_1}, b_{i_2})^r,
\quad \text{and} \quad
R(r) = \sum_{j,i} m_j n_i\, d(a_j, b_i)^r.
\]
By definition of the p-negative type gap Γ we have
\[
L(p) + \Gamma \le R(p).
\tag{6.1}
\]
The strategy of the proof is to argue that
\[
L(p + \zeta) < L(p) + \Gamma \quad \text{and} \quad R(p) \le R(p + \zeta)
\tag{6.2}
\]
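provided ζ > 0 is sufficiently small. If so, then L(p + ζ) < R(p + ζ) by (6.1) and (6.2). In other words, (X, d)
has strict (p + ζ)-negative type under these circumstances. Now, as all non-zero distances in (X, d) are at
least one, we automatically obtain the second inequality of (6.2) for all ζ > 0. Therefore we only need to
concentrate on the first inequality of (6.2). First of all, notice that
\[
\begin{aligned}
L(p + \zeta) - L(p)
 &= \sum_{j_1 < j_2} m_{j_1} m_{j_2} \bigl( d(a_{j_1}, a_{j_2})^{p+\zeta} - d(a_{j_1}, a_{j_2})^{p} \bigr)
  + \sum_{i_1 < i_2} n_{i_1} n_{i_2} \bigl( d(b_{i_1}, b_{i_2})^{p+\zeta} - d(b_{i_1}, b_{i_2})^{p} \bigr) \\
 &\le \Bigl( \sum_{j_1 < j_2} m_{j_1} m_{j_2} + \sum_{i_1 < i_2} n_{i_1} n_{i_2} \Bigr) \cdot \bigl( D^{p+\zeta} - D^{p} \bigr) \\
 &\le \Bigl( 1 - \frac{1}{2} \Bigl( \frac{1}{s} + \frac{1}{t} \Bigr) \Bigr) \cdot \bigl( D^{p+\zeta} - D^{p} \bigr) \\
 &\le \Bigl( 1 - \frac{1}{2} \Bigl( \frac{1}{\lfloor m/2 \rfloor} + \frac{1}{\lceil m/2 \rceil} \Bigr) \Bigr) \cdot \bigl( D^{p+\zeta} - D^{p} \bigr) \\
 &= \phi(m) \cdot \bigl( D^{p+\zeta} - D^{p} \bigr) \\
 &\le \phi(n) \cdot \bigl( D^{p+\zeta} - D^{p} \bigr),
\end{aligned}
\tag{6.3}
\]
by applying Lemmas 6.1 and 6.3. Now observe that:
\[
\phi(n) \cdot \bigl( D^{p+\zeta} - D^{p} \bigr) \le \Gamma
\quad \text{iff} \quad
\zeta \le \frac{ \ln \Bigl( 1 + \dfrac{\Gamma}{D^p \cdot \phi(n)} \Bigr) }{ \ln D }.
\tag{6.4}
\]
By combining (6.3) and (6.4), we obtain the first inequality of (6.2) for all ζ > 0 such that
\[
\zeta < \zeta_0 = \frac{ \ln \Bigl( 1 + \dfrac{\Gamma}{D^p \cdot \phi(n)} \Bigr) }{ \ln D }.
\]
Hence L(p + ζ) < R(p + ζ) for any such ζ. It is also clear from (6.2), (6.3) and (6.4) that
$L(p + \zeta_0) \le R(p + \zeta_0)$.
These observations and descaling the metric (if necessary) complete the proof of the theorem. □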
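The exponent increment ζ of Theorem 6.6 is straightforward to evaluate once the gap, the diameter and the smallest non-zero distance are known. The following is a minimal sketch under those assumptions; the function names `phi` and `zeta` and the choice of a star with three unit edges as the example are illustrative only.

```python
import math

def phi(m):
    """phi(m) = 1 - (1/2) * (1/floor(m/2) + 1/ceil(m/2)), for integers m >= 2."""
    return 1 - 0.5 * (1 / (m // 2) + 1 / ((m + 1) // 2))

def zeta(gap, diam, min_dist, n, p):
    """Evaluate the expression for zeta in Theorem 6.6 for an n-point space, n >= 3.

    gap is the p-negative type gap, diam the metric diameter and min_dist the
    smallest non-zero distance, all assumed to be known in advance.
    """
    scaled_diam = diam / min_dist          # the scaled metric diameter D_X
    if scaled_diam <= 1:
        raise ValueError("all non-zero distances are equal; the formula does not apply")
    return math.log(1 + gap / (diam ** p * phi(n))) / math.log(scaled_diam)

# Star with three unit edges (n = 4): gap 1/3 by Theorem 5.18, diameter 2.
lower_bound = 1 + zeta(gap=1 / 3, diam=2.0, min_dist=1.0, n=4, p=1)
print(lower_bound)
```

The guard against a scaled diameter of 1 mirrors the first step of the proof, where multiples of the discrete metric are set aside because ln D would vanish.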
Recall that the ordinary path metric ρ on a finite tree T assigns length one to each edge in the tree (with all
other distances determined geodesically). With this in mind, we see that Theorem 6.6 provides a significant
improvement of the estimate given in [2, Corollary 5.5].
Corollary 6.7. Let T be a finite tree on n = |T| ≥ 3 vertices endowed with the ordinary path metric ρ.
Let D denote the metric diameter of the resulting finite metric tree (T, ρ). Let q(T) denote the maximal
p-negative type of (T, ρ). Then:
\[
q(T) \ge 1 + \ln \Bigl( 1 + \frac{1}{D \cdot (n - 1) \cdot \phi(n)} \Bigr) \Big/ \ln D.
\tag{6.5}
\]
Proof. By Theorem 5.18, $\Gamma_T^1 = \frac{1}{n-1}$. Now apply Theorem 6.6 with p = 1. □
Remark 6.8. The lower bound on q(T) given in the statement of Corollary 6.7 is basically of the correct
order of magnitude when D = 2. To see this, first of all notice that if n > 2 is even and D = 2, then (6.5)
in Corollary 6.7 simplifies to give:
\[
q(T) \ge 1 + \ln \Bigl( 1 + \frac{n}{2(n - 1)(n - 2)} \Bigr) \Big/ \ln 2.
\]
However, if T denotes the star with n − 1 leaves (endowed with the ordinary path metric ρ), then [2, Theorem
5.6] gives the exact value:
\[
q(T) = 1 + \ln \Bigl( 1 + \frac{1}{n - 2} \Bigr) \Big/ \ln 2.
\]
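To get a feel for how close the two expressions in Remark 6.8 are, one can tabulate them for a few even values of n. The sketch below assumes nothing beyond the two displayed formulas; the function names and the sample values of n are arbitrary.

```python
import math

def lower_bound(n):
    """Lower bound on q(T) from (6.5) with D = 2 and n even."""
    return 1 + math.log(1 + n / (2 * (n - 1) * (n - 2))) / math.log(2)

def exact_star(n):
    """Exact value of q(T) for the star with n - 1 leaves, as quoted from [2]."""
    return 1 + math.log(1 + 1 / (n - 2)) / math.log(2)

for n in (4, 10, 100, 1000):
    print(n, lower_bound(n), exact_star(n))

# Ratio of the two excesses over 1 for a large star.
ratio = (lower_bound(1000) - 1) / (exact_star(1000) - 1)
print(ratio)
```

For large n the excess of the lower bound over 1 is roughly half the excess of the exact value, which is the sense in which the estimate has the correct order of magnitude.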
7. Supremal p-negative type of a finite metric space cannot be strict
If the p-negative type gap $\Gamma_X^p$ of a metric space (X, d) is positive then (X, d) clearly has strict p-negative
type. It is interesting to ask to what extent, if any, the converse of this statement is true. The next
result points out that the converse statement is always true in the case of finite metric spaces. By way of a
notable contrast, [2, Theorem 5.7] shows that there exist infinite metric trees (X, d) of strict 1-negative type
with 1-negative type gap $\Gamma_X^1 = 0$.
Theorem 7.1. Let p ≥ 0 and let (X, d) be a finite metric space. Then (X, d) has strict p-negative type if
and only if $\Gamma_X^p > 0$.
Proof. Let p ≥ 0 be given. We need only concern ourselves with the forward implication of the theorem
since the converse is clear from the definitions.
Assume that (X, d) is a finite metric space with strict p-negative type. By Theorem 5.2, $\gamma_D^p(\omega) > 0$ for
each normalized (s, t)-simplex D(ω) ⊆ X. Referring back to Definitions 5.1 and 5.3 we further note that we
may assume that each such p-negative type simplex gap $\gamma_D^p$ is defined on the compact set
$N_{s,t} \subset \mathbb{R}^{s+t}$ and is positive at each point of $N_{s,t}$. Since $\gamma_D^p$ is continuous, it attains its
minimum on $N_{s,t}$, and therefore
\[
\min \bigl\{ \gamma_D^p(\omega) \mid \omega \in N_{s,t} \bigr\} > 0
\]
for each (s, t)-simplex D in X. But as |X| < ∞ the number of distinct (s, t)-simplexes D that can be formed
from X must be finite. Thus the p-negative type gap $\Gamma_X^p$ is seen to be the minimum of finitely many positive
quantities. As such we obtain the desired result: $\Gamma_X^p > 0$. □
Corollary 7.2. Let p ≥ 0 and let (X, d) be a finite metric space. If (X, d) has strict p-negative type, then
(X, d) must have strict q-negative type for some interval of values q ∈ [p, p + ζ), ζ > 0.
Proof. By Theorem 7.1, $\Gamma = \Gamma_X^p > 0$. Now apply Theorem 6.6. □
As an immediate consequence of Corollary 7.2 we obtain an important property of finite metric spaces.
Corollary 7.3. The supremal p-negative type of a finite metric space cannot be strict.
Moreover, since p-negative type holds on closed intervals, we therefore obtain an interesting case of equality
in the classical negative type inequalities as a direct consequence of Corollary 7.3.
Corollary 7.4. Let (X, d) be a finite metric space. Let q denote the supremal p-negative type of (X, d). If
q < ∞ then there exists a normalized (s, t)-simplex $D(\omega) = [a_j(m_j); b_i(n_i)]_{s,t}$ in X such that
$\gamma_D^q(\omega) = 0$. In other words, we obtain:
\[
\sum_{1 \le j_1 < j_2 \le s} m_{j_1} m_{j_2}\, d(a_{j_1}, a_{j_2})^q
 + \sum_{1 \le i_1 < i_2 \le t} n_{i_1} n_{i_2}\, d(b_{i_1}, b_{i_2})^q
 = \sum_{j,i=1}^{s,t} m_j n_i\, d(a_j, b_i)^q.
\]
Corollary 7.5. The following finite metric spaces all have strict q-negative type for some interval of values
q ∈ [1, 1 + ζ) (where ζ > 0 depends upon the particular space):
(a) Any three-point metric space.
(b) Any finite metric tree.
(c) Any finite isometric subspace of a k-sphere $S^k$ (endowed with the usual geodesic metric) that contains
at most one pair of antipodal points.
(d) Any finite isometric subspace of the hyperbolic space $H^k_{\mathbb{R}}$ (or $H^k_{\mathbb{C}}$).
(e) Any finite isometric subspace of a Hadamard manifold.
Proof. All of the above finite metric spaces have strict p-negative type for p = 1 by results given in [7]
and [8]. We may therefore apply Corollary 7.2 en masse. □
8. Appendix One: A closer look at Hilbert spaces
The purpose of this appendix is to take a more formal look at the geometry of Hilbert spaces over the
complex field. Most results in this appendix are equally valid for Hilbert spaces over the real field (and it
will be clear when they are not). So, all vector spaces under consideration in this appendix are assumed to
be over the complex field C unless explicitly stated otherwise.
An inner product on a complex vector space X is a “sesquilinear form” on X with some additional
properties that lead (for example) to a norm on X which has particularly nice geometric properties. The
basic ideas are as follows.
Definition 8.1. A sesquilinear form on a complex vector space X is a map $(\cdot \mid \cdot) : X \times X \to \mathbb{C}$
that is linear in the first variable and conjugate linear in the second variable. That is to say:
(1) $(\alpha x + \beta y \mid z) = \alpha (x \mid z) + \beta (y \mid z)$, and
(2) $(z \mid \alpha x + \beta y) = \overline{\alpha} (z \mid x) + \overline{\beta} (z \mid y)$, for all x, y, z ∈ X and all α, β ∈ C.
If it is also the case that we have
(3) $(x \mid y) = \overline{(y \mid x)}$ for all x, y ∈ X,
then we say that the sesquilinear form $(\cdot \mid \cdot)$ on X is self adjoint. In this case, conditions (1) and (3)
automatically imply condition (2).
Notice that if $(\cdot \mid \cdot)$ is a sesquilinear form on a complex vector space X (which is not necessarily self
adjoint), then a new sesquilinear form $(\cdot \mid \cdot)^*$ on X may be defined as follows:
\[
(x \mid y)^* = \overline{(y \mid x)} \quad \text{for all } x, y \in X.
\]
We call $(\cdot \mid \cdot)^*$ the adjoint form of $(\cdot \mid \cdot)$. The condition (3) of being self adjoint requires
$(\cdot \mid \cdot) \equiv (\cdot \mid \cdot)^*$.
A direct computation shows that if $(\cdot \mid \cdot)$ is a sesquilinear form on a complex vector space X, then the
polarization identity
\[
(x \mid y) = \frac{1}{4} \sum_{k=0}^{3} i^k\, (x + i^k y \mid x + i^k y)
\tag{8.1}
\]
holds for all x, y ∈ X.
Problem 8.2. Derive the polarization identity (8.1) for a sesquilinear form on a complex vector space.
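Before attempting the derivation, the polarization identity can be checked numerically for one concrete sesquilinear form. The sketch below uses the standard dot product on a finite-dimensional complex space (linear in the first slot, conjugate linear in the second); the function names and the sample vectors are arbitrary, and a single numerical check is of course not a proof.

```python
def inner(x, y):
    """Standard inner product: linear in x, conjugate linear in y."""
    return sum(xj * yj.conjugate() for xj, yj in zip(x, y))

def polarize(x, y):
    """Right-hand side of the polarization identity (8.1)."""
    total = 0
    for unit in (1, 1j, -1, -1j):          # the powers i^0, i^1, i^2, i^3
        z = [xj + unit * yj for xj, yj in zip(x, y)]
        total += unit * inner(z, z)
    return total / 4

x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]
print(inner(x, y), polarize(x, y))
```

Both printed values should coincide up to rounding. The check relies on the conjugation sitting in the second slot; with the opposite convention the identity would recover the conjugate of the form instead.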
Definition 8.3. An inner product on a complex vector space X is a self adjoint sesquilinear form $(\cdot \mid \cdot)$ on
X with two additional properties:
(4) [Positivity] $(x \mid x) \ge 0$ for all x ∈ X, and
(5) if $(x \mid x) = 0$ then x = 0.
In summary, an inner product on a complex vector space X is a map $(\cdot \mid \cdot) : X \times X \to \mathbb{C}$ that satisfies
conditions (1) through (5) as delineated above. The resulting pair $(X, (\cdot \mid \cdot))$ is called an inner product space.
Remark 8.4. Notice that two vectors x, y in an inner product space $(X, (\cdot \mid \cdot))$ are equal if and only if
$(x \mid z) = (y \mid z)$ for all z ∈ X. This follows from conditions (1) and (5) above. In particular, if $(x \mid z) = 0$ for
all z ∈ X, then x = 0.
A key property of any inner product $(\cdot \mid \cdot)$ on a complex vector space X is the Cauchy-Schwarz inequality:
\[
|(x \mid y)| \le (x \mid x)^{1/2} \cdot (y \mid y)^{1/2} \quad \text{for all } x, y \in X.
\tag{8.2}
\]
Proof. To prove this inequality we may assume that $(x \mid y)$ is real. (If not, simply replace x by αx for
a suitable complex number α with |α| = 1. This does not change either side of the inequality.) Then note
that for given fixed vectors x, y ∈ X
\[
0 \le (tx + y \mid tx + y) = t^2 (x \mid x) + 2t (x \mid y) + (y \mid y)
\]
is a non negative real quadratic in the real variable t ∈ R. Thus the discriminant of this quadratic, namely
$4(x \mid y)^2 - 4(x \mid x)(y \mid y)$, is non positive. This leads directly to the Cauchy-Schwarz inequality. □
If $(\cdot \mid \cdot)$ is an inner product on a complex vector space X we may consider the related “homogeneous”
function $\| \cdot \| : X \to \mathbb{R}_+ : x \mapsto (x \mid x)^{1/2}$. For any x, y ∈ X, we see that
\[
\begin{aligned}
\|x + y\|^2 &= (x + y \mid x + y) \\
 &= (x \mid x) + (y \mid y) + (x \mid y) + (y \mid x) \\
 &= \|x\|^2 + \|y\|^2 + 2\,\mathrm{Re}\,(x \mid y) \\
 &\le \|x\|^2 + \|y\|^2 + 2\,|(x \mid y)| \\
 &\le \|x\|^2 + \|y\|^2 + 2\,\|x\| \cdot \|y\| \\
 &= (\|x\| + \|y\|)^2,
\end{aligned}
\]
by the Cauchy-Schwarz inequality. It is therefore clear that $\| \cdot \|$ is a norm on X. Moreover, the second
equality in the above computation also implies the parallelogram law for this (inner product) induced norm:
\[
\|x + y\|^2 + \|x - y\|^2 = 2(\|x\|^2 + \|y\|^2) \quad \text{for all } x, y \in X.
\]
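The three facts just derived (Cauchy-Schwarz, the triangle inequality and the parallelogram law) can be observed numerically for the standard inner product on a finite-dimensional complex space. This is a small sketch with arbitrarily chosen vectors; the helper names are hypothetical.

```python
import math

def inner(x, y):
    """Standard inner product: linear in x, conjugate linear in y."""
    return sum(xj * yj.conjugate() for xj, yj in zip(x, y))

def norm(x):
    """Norm induced by the inner product; (x | x) is real and non-negative."""
    return math.sqrt(inner(x, x).real)

x = [1 + 2j, 3 - 1j, 0.5j]
y = [2 - 1j, 1j, -1 + 1j]
total = [xj + yj for xj, yj in zip(x, y)]
diff = [xj - yj for xj, yj in zip(x, y)]

cauchy_schwarz_holds = abs(inner(x, y)) <= norm(x) * norm(y)
triangle_holds = norm(total) <= norm(x) + norm(y)
# The parallelogram law is an equality, so its defect should vanish up to rounding.
parallelogram_defect = (norm(total) ** 2 + norm(diff) ** 2
                        - 2 * (norm(x) ** 2 + norm(y) ** 2))

print(cauchy_schwarz_holds, triangle_holds, abs(parallelogram_defect) < 1e-9)  # True True True
```

The two inequalities are generally strict for vectors that are not proportional, whereas the parallelogram law holds with equality for every pair.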
Conversely, if a norm $\| \cdot \|$ on a complex vector space X satisfies the parallelogram law, then there must
be an inner product $(\cdot \mid \cdot)$ on X such that $\|x\| = (x \mid x)^{1/2}$ for all x ∈ X. Motivated by the polarization
identity, the idea is to define
\[
(x \mid y) = \frac{1}{4} \sum_{k=0}^{3} i^k \cdot \|x + i^k y\|^2
\]
for all x, y ∈ X, and then directly check that everything works.
Definition 8.5. A complex normed space $(X, \| \cdot \|)$ is said to be a Euclidean space (or, a pre-Hilbert space)
if there exists an inner product $(\cdot \mid \cdot)$ on X such that $\|x\| = (x \mid x)^{1/2}$ for all x ∈ X.
It is not too difficult to see that the inner product on a Euclidean space $(X, \| \cdot \|)$ must be continuous as a
mapping X × X → C. This is a simple consequence of the Cauchy-Schwarz inequality (8.2).
Theorem 8.6. The inner product $(\cdot \mid \cdot) : X \times X \to \mathbb{C}$ on a Euclidean space $(X, \| \cdot \|)$ is necessarily a
continuous map.
Problem 8.7. Prove Theorem 8.6.
Evidently, from what we have said above, a complex normed space $(X, \| \cdot \|)$ is a Euclidean space if and only
if its norm satisfies the parallelogram law:
\[
\|x + y\|^2 + \|x - y\|^2 = 2(\|x\|^2 + \|y\|^2) \quad \text{for all } x, y \in X.
\]
In particular, it follows that a complex normed space is a Euclidean space if and only if each of its two
dimensional subspaces is a Euclidean space. We now revise our preliminary definition of a Hilbert space.
Definition 8.8. A complete Euclidean space is called a Hilbert space. Put differently, a Hilbert space is a
Banach space whose norm can be induced from an inner product.
Obviously any subspace of a Euclidean space is a Euclidean space. And, any closed subspace of a Hilbert
space is a Hilbert space. We now turn to some canonical examples of Euclidean and Hilbert spaces. The
first two examples revisit $\ell_2^{(n)}$ and $\ell_2$.
Example 8.9. Let n ∈ N be given. The Hilbert space $\ell_2^{(n)}$ consists of the complex vector space $\mathbb{C}^n$ together
with the familiar “dot” product of elementary linear algebra:
\[
(x \mid y) = \sum_{j=1}^{n} x_j \overline{y_j} \quad \text{for all } x = (x_j),\ y = (y_j) \in \mathbb{C}^n.
\]
The norm on $\mathbb{C}^n$ induced by this inner product is the familiar Euclidean norm:
\[
\|x\|_2 = \Bigl( \sum_{j=1}^{n} |x_j|^2 \Bigr)^{1/2} \quad \text{for all } x = (x_j) \in \mathbb{C}^n.
\]
Example 8.10. Let $S_2$ denote the complex vector space of all square summable sequences of complex
numbers⁸ endowed with the usual pointwise vector operations ($x + y = (x_j + y_j)$, and so on). The Hilbert
space $\ell_2$ consists of $S_2$ endowed with the natural inner product:
\[
(x \mid y) = \sum_{j=1}^{\infty} x_j \overline{y_j} \quad \text{for all } x = (x_j),\ y = (y_j) \in S_2.
\]
This inner product on $S_2$ induces the $\ell_2$-norm:
\[
\|x\|_2 = \Bigl( \sum_{j=1}^{\infty} |x_j|^2 \Bigr)^{1/2} \quad \text{for all } x = (x_j) \in S_2.
\]
Example 8.11. Suppose a, b ∈ R with a < b. Consider C([a, b]) the complex vector space of all continuous
functions f : [a, b] → C endowed with the usual pointwise vector operations ((f + g)(t) = f(t) + g(t) for all
t ∈ [a, b], and so on). A natural inner product on C([a, b]) is obtained via the Riemann integral as follows:
\[
(f \mid g) = \int_a^b f(t)\, \overline{g(t)}\, dt \quad \text{for all } f, g \in C([a, b]).
\tag{8.3}
\]
This inner product induces an $L_2$-norm on C([a, b]):
\[
\|f\|_2 = \Bigl( \int_a^b |f(t)|^2\, dt \Bigr)^{1/2} \quad \text{for all } f \in C([a, b]).
\]
However, $(C([a, b]), \| \cdot \|_2)$ is not complete, and so it is a Euclidean space which is not a Hilbert space.
Example 8.12. More generally, we may replace C([a, b]) in the previous example with the complex vector
space of all Lebesgue measurable functions f : [a, b] → C such that
\[
\int_a^b |f(t)|^2\, dt < \infty.
\]
Proceeding with the analogous inner product to (8.3) in the previous example, we obtain the Hilbert space
$L_2([a, b])$ where the induced norm is the expected $L_2$-norm. Standard real analysis shows that
$(C([a, b]), \| \cdot \|_2)$ is a dense subspace of $(L_2([a, b]), \| \cdot \|_2)$.
Definition 8.13. A non empty subset C of a Hilbert space $(H, \| \cdot \|)$ is said to be convex if, for all x, y ∈ C
and all t ∈ [0, 1], we have tx + (1 − t)y ∈ C. (This means that the “line segment” joining x to y lies in C.)
⁸ Recall that a sequence $(x_n)$ of complex numbers is said to be square summable if the infinite series
$\sum |x_n|^2$ converges.
Theorem 8.14. Let F be a non-empty closed and convex subset of a Hilbert space (H, �·�). Then F contains
a unique vector of minimal norm. More generally, given any y ∈ H there exists a unique vector x0 ∈ F such
that
�y − x0 � = inf �y − x�.
x∈F
Proof. We need only worry about the particular case: y = 0. (If y �= 0 we may consider the translate
F − y.) Let α = inf �x�, the metric distance from F to y = 0.
x∈F
Consider arbitrary vectors x, z ∈ F. By the convexity of F we have ½(x + z) ∈ F, and so ‖½(x + z)‖ ≥ α. From
this it follows that ‖x + z‖² ≥ 4α². Hence, by the parallelogram law,
\[
2(\|x\|^2 + \|z\|^2) = \|x + z\|^2 + \|x - z\|^2 \ge 4\alpha^2 + \|x - z\|^2. \tag{8.4}
\]
Now, by definition of α, there exists a sequence (xₙ) in F such that ‖xₙ‖ → α as n → ∞. As a result, via (8.4) with
x = xₘ and z = xₙ, we see that the right-hand side of the inequality ‖xₘ − xₙ‖² ≤ 2(‖xₘ‖² + ‖xₙ‖²) − 4α²
becomes arbitrarily small as m, n → ∞. This shows that (xₙ) is a Cauchy sequence in the closed set F ⊆ H.
So, by the completeness of H and the closedness of F, there must exist an x₀ ∈ F such that xₙ → x₀ as n → ∞.
In particular, by the continuity of the norm,
\[
\|x_0\| = \lim_{n \to \infty} \|x_n\| = \alpha,
\]
which completes the existence part of the theorem.
Finally, if z₀ ∈ F has the same properties as x₀, then ‖x₀‖ = ‖z₀‖ = α and we see from (8.4) that
\[
4\alpha^2 \ge 4\alpha^2 + \|x_0 - z_0\|^2,
\]
which forces uniqueness: x₀ = z₀. □
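To see Theorem 8.14 at work in the simplest setting, the following sketch takes H = R³ with the Euclidean norm
and F the closed cube [−1, 1]³ (both are assumptions made for illustration). For such a box the nearest point is
obtained by clamping each coordinate, because the squared distance splits into a sum over coordinates. The names
`project_onto_box` and `dist` are hypothetical helpers.

```python
import math
import random

def project_onto_box(y, lo, hi):
    """Nearest point to y in the closed, convex box [lo, hi]^n (coordinate-wise clamp)."""
    return [min(max(t, lo), hi) for t in y]

def dist(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

y = [2.0, -3.0, 0.5]
x0 = project_onto_box(y, -1.0, 1.0)
best = dist(y, x0)
print(x0, best)

# Sanity check: no randomly sampled point of the box is closer to y than x0
# (a small tolerance guards against floating-point rounding).
rng = random.Random(0)
samples = [[rng.uniform(-1.0, 1.0) for _ in y] for _ in range(10_000)]
assert all(dist(y, x) >= best - 1e-12 for x in samples)
```

Random sampling does not prove minimality, of course; it only illustrates the existence and uniqueness that the
theorem guarantees.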
References
[1] A. Cayley, On a theorem in the geometry of position, Cambridge Mathematical Journal II (1841), 267–271. (Also in
The Collected Mathematical Papers of Arthur Cayley (Vol. I), Cambridge University Press, Cambridge (1889), pp.
1–4.)
[2] I. Doust and A. Weston, Enhanced negative type for finite metric trees, J. Funct. Anal. 254 (2008), 2336–2364. (See
arXiv:0705.0411v2 for an extended version of this paper.)
[3] P. Enflo, On the nonexistence of uniform homeomorphisms between Lp-spaces, Ark. Mat. 8 (1968), 103–105.
[4] P. Enflo, On a problem of Smirnov, Ark. Mat. 8 (1969), 107–109.
[5] P. Enflo, Uniform structures and square roots in topological groups I, Israel J. Math. 8 (1970), 230–252.
[6] P. Enflo, Uniform structures and square roots in topological groups II, Israel J. Math. 8 (1970), 253–272.
[7] P. G. Hjorth, S. L. Kokkendorff and S. Markvorsen, Hyperbolic spaces are of strictly negative type, Proc. Amer. Math.
Soc. 130 (2002), 175–181.
[8] P. Hjorth, P. Lisoněk, S. Markvorsen and C. Thomassen, Finite metric spaces of strictly negative type, Linear Algebra
Appl. 270 (1998), 255–273.
[9] J.-F. Lafont and S. Prassidis, Roundness properties of groups, Geom. Dedicata 117 (2006), 137–160.
[10] J. R. Lee, A. Naor and Y. Peres, Trees and Markov convexity, Geom. Funct. Anal. 18 (2009).
[11] H. Li and A. Weston, Strict p-negative type of a metric space, Positivity 14 (2010), 529–545.
[12] C. J. Lennard, A. M. Tonge and A. Weston, Generalized roundness and negative type, Mich. Math. J. 44 (1997),
37–45.
[13] K. Menger, Die Metrik des Hilbert-Raumes, Akad. Wiss. Wien Abh. Math.-Natur. Kl. 65 (1928), 159–160.
[14] G. F. Simmons, Introduction to Topology and Modern Analysis, International Series in Pure and Applied Mathematics, McGraw-Hill, xv+1–372.
[15] I. J. Schoenberg, Remarks to Maurice Fréchet’s article “Sur la définition axiomatique d’une classe d’espaces distanciés
vectoriellement applicable sur l’espace de Hilbert”, Ann. Math. 36 (1935), 724–732.
[16] I. J. Schoenberg, On certain metric spaces arising from euclidean spaces by a change of metric and their imbedding
in Hilbert space, Ann. Math. 38 (1937), 787–793.
[17] I. J. Schoenberg, Metric spaces and positive definite functions, Trans. Amer. Math. Soc. 44 (1938), 522–536.
Anthony Weston, Department of Mathematics and Statistics, Canisius College, Buffalo NY 14208, USA