Defining Quantiles for Functional Data

The University of Melbourne
Department of Mathematics and Statistics
Defining Quantiles for Functional Data
with an Application to the Reversal of Stock Price Decreases
Simon Walter
Supervisor: Peter Hall
Honours Thesis
October 28, 2011
Preface
This thesis introduces and motivates a surrogate for quantiles for functional data defined on a
closed interval of the real line. The central results are Theorems 3.6 and 3.9, which show that the
quantiles introduced characterise the distribution of a functional random variable under relatively
weak conditions. Three applications of these quantiles are also suggested: describing functional
datasets, detecting outliers, and constructing functional qq-plots.
In univariate data analysis, statistics based on quantiles are versatile and popular: the median
measures central tendency; inter-quartile range measures dispersion; quantile-based statistics
can measure skewness and kurtosis; and qq-plots are used to assess the fidelity of a sample to a
specified distribution or the equality of the distribution of two samples. It is therefore natural
to try to extend the definition of quantiles to multi-dimensional data. Quantiles are defined by
ordering values of a random variable; and since there is no natural order for Rn when n ≥ 2, there
is no trivially obvious extension. Nonetheless, substantial progress has been made: the definition
of multivariate quantiles given by Chaudhuri (1996) is perhaps the most successful, but several
approaches are reviewed in Serfling (2002). Easton & McCulloch (1990) and Liang & Ng (2009)
show that plots with characteristics similar to those of the univariate qq-plot can be constructed for
multivariate distributions. More recently, Fraiman & Pateiro-López (2011) have suggested a
definition of quantiles tailored for use in functional data analysis, similar to the multivariate
quantiles described in Kong & Mizera (2008).
This thesis takes a different approach; we exploit the fact that functional data are defined on
a continuum and appeal directly to the definition of univariate quantiles. There is no analogue
to this approach for finite-dimensional multivariate distributions. We will see that functional
quantiles defined in this way share many (but not all) of the characteristics of univariate quantiles.
These quantiles are also substantially easier to compute and analyse than the quantiles proposed
by Fraiman & Pateiro-López (2011) and Chaudhuri (1996).
The first part of this thesis is a literature review; it does not contain any original material.
Chapter 1 is a selective introduction to functional data analysis; Chapter 2 describes existing
methods for extending quantiles to multivariate distributions. The original contribution of this
thesis is contained entirely in the second part: Chapter 3 describes a generalisation of quantiles for
functional data, proves theoretical properties of the quantiles defined, and suggests methods
to estimate them; Chapter 4 describes applications of the quantiles defined; and Chapter 5
highlights deficiencies in the quantiles developed and suggests further avenues of fruitful enquiry.
The functional quantiles introduced are demonstrated by analysing the historical stock prices
of NASDAQ listed companies to determine whether large single-day declines are disproportionately followed by a partial reversal in subsequent trading. The dataset used is described in
Section 1.5.
The following conventions will be followed: ‘⊂’ will be used in its wide interpretation, that
is A ⊂ B includes A = B; a superscript ‘c’ will be used to denote complementation, so Ac = Ω\A; and ‘| · |’
will be used to denote the cardinality of a set. ‘⊥⊥’ will be used to denote independence, so
if X and Y are random variables then X ⊥⊥ Y means X and Y are independent; ‘∼’ will be
used to denote equality of distribution, so X ∼ Y iff X and Y have the same distribution; similarly, if
X ≁ Y then X and Y do not have the same distribution; 1{condition} will be used to denote the
indicator function so 1{condition} = 1 if the condition is true and 0 if it is not; I will be used
to denote an arbitrary closed interval of the real line [a, b] where a, b ∈ R and a < b; and ∂ will
be used to denote the partial derivative of a function with respect to t. Important definitions,
theorems and lemmas will be framed. All other symbols used have standardised meanings.
I would like to acknowledge my supervisor, Peter Hall, for his generosity with both his time
and ideas. All inaccuracies and any infelicities of language are, however, mine and mine alone.
This work was supported by the Maurice H. Belz Scholarship.
Contents
Part I  Literature Review

1 Functional data analysis
  1.1 Introduction
  1.2 Formal definitions
    1.2.1 Preliminary definitions
    1.2.2 Defining functional random variables
    1.2.3 Defining functional datasets
  1.3 Constructing functional datasets
    1.3.1 Assumptions
    1.3.2 Basis smoothing
    1.3.3 Kernel smoothing
  1.4 Registration
  1.5 A sample dataset

2 Multivariate quantiles
  2.1 The failure of the ordinary approach
  2.2 Assessing the quality of multivariate quantiles
  2.3 Methods for constructing multivariate quantiles
    2.3.1 Norm minimisation
    2.3.2 Functional quantiles
  2.4 Room for improvement?

Part II  Quantiles for Functional Data

3 Defining functional quantiles
  3.1 Functional quantiles
    3.1.1 Defining functional quantiles
    3.1.2 Properties of functional quantiles
    3.1.3 Characterising a functional random variable
  3.2 Empirical functional quantiles
    3.2.1 Defining empirical quantiles
    3.2.2 Properties of empirical quantiles
    3.2.3 Missing data

4 Applications
  4.1 Describing functional datasets
    4.1.1 Location
    4.1.2 Dispersion
    4.1.3 Skewness
    4.1.4 Alternatives
  4.2 Detecting outliers
    4.2.1 The classification process
    4.2.2 An example
    4.2.3 Alternatives
  4.3 QQ-plots
    4.3.1 Univariate qq-plots
    4.3.2 Functional qq-plots
    4.3.3 Alternatives

5 Epilogue
  5.1 Further refinements
    5.1.1 Characterising a functional random variable
    5.1.2 More robust measures of dispersion and skewness
    5.1.3 Better identification of outliers
    5.1.4 Problems with functional qq-plots
  5.2 Future research directions
    5.2.1 Using quantiles to construct formal tests
    5.2.2 A goodness-of-fit test based on functional qq-plots

Bibliography
Part I
Literature Review
Chapter 1
Functional data analysis
This chapter reviews the areas of functional data analysis required to apply the original methods
developed in this thesis. In Section 1.2 we examine the definitions at the core of functional data
analysis; in Section 1.3 methods for constructing functional datasets from a finite number of
points are described; and in Section 1.4 popular methods for registering or aligning functional
data to remove uninformative phase variation are illustrated.
There is a large and rapidly expanding literature on functional data analysis. A complete
introduction is available in Ramsay & Silverman (1997, 2005) and Ferraty & Vieu (2006); a
description of the state of the art is available in Ferraty (2010, 2011) and Ferraty & Romain
(2011).
1.1 Introduction
Functional data analysis is the analysis of data that may be modelled as being generated by a
random function varying over a continuum. The most common objects of study are continuous
time series, but general random curves, random surfaces and higher dimensional analogues are
also studied. The quantiles developed in this thesis are only immediately applicable to random
variables defined on a set of injective functions all with domain I where I is a closed interval of
R; so this introduction will focus primarily on the statistical analysis of such random variables.
To provide a feel for some typical analyses, Figure 1.1 plots examples of four canonical
datasets from Ramsay & Silverman (2005): the top right chart shows selected observations from
data collected by Jones & Bayley (1941) in the Berkeley Growth Study. This study measured the
heights of 61 children born in 1928 or 1929 at regular intervals from birth to the age of 18. Top
left shows annual temperature variation for a selection of Canadian weather stations. Bottom
left shows the US non-durable goods index and bottom right shows twenty recordings of the
force exerted between the thumb and index finger of a subject during a brief impulse where the
subject aimed to achieve a 10N force. The aspect of these datasets that makes them amenable to
functional data analysis is that it is reasonable to suggest that, although measurements are only
obtained at discrete intervals, measurements could, in theory, be obtained at any time during the
period of analysis. It is therefore natural to analyse the data as if it were defined in continuous
time.
Functional data analysis is assuming increasing importance in statistics because of the growing
use of technology to collect very high-dimensional datasets that are often best analysed by
assuming the data vary over some continuum. Typical applications include: the detection of
forged signatures in Geenens (2011), the analysis of crime statistics in Ramsay & Silverman
(2002), and the interpretation of economic and financial time series in Cai (2011). Note, however,
that functional data analysis is not a viable option for every very high dimensional dataset; for
example, DNA microarray data must usually be treated as discrete data, and so should not be
modelled by a random function varying over a continuum.
1.2 Formal definitions
The informal definition given in the introduction can be recast to say that functional data analysis
is the analysis of data generated by a functional random variable. Defining functional random
variables concretely requires some preliminary definitions.
1.2.1 Preliminary definitions
Definition 1.1 (Metric spaces) The pair (M, d) is a metric space iff M is a set and d is a
function mapping M × M → R such that for each x, y, z ∈ M :
1. d(x, y) ≥ 0 (non-negativity)
2. d(x, y) = 0 iff y = x (identity of discernibles)
3. d(x, y) = d(y, x) (symmetry); and
4. d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality)
The abbreviated notation M will be used to denote the metric space (M, d).
Metric spaces are too general for the methods developed later so we identify certain metric spaces
with useful characteristics.
Definition 1.2 (Compactness and connectedness)
1. A metric space M is compact iff for each collection of open sets {Uα}α∈A satisfying Uα ⊂ M for every α ∈ A and M = ∪α∈A Uα, there is some finite set B ⊂ A such that M = ∪α∈B Uα.
2. A metric space M is connected iff there are no non-empty open sets U1 and U2 such that U1 ∩ U2 = ∅
and U1 ∪ U2 = M.
[Figure 1.1 appears here: four panels showing the Berkeley Growth Data (height (cm) against age), the Canadian Weather Station Data (Temperature (C), January to December, for Pr. Rupert, Montreal, Edmonton and Resolute), the US Nondurable Goods Index (1920–2000) and the Pinch Force Data (Force (N) against seconds).]
Figure 1.1: Examples of functional datasets frequently used to describe new and existing techniques; the data and the constructed plots are drawn from Ramsay & Silverman (2005). The
plots were reconstructed by Ramsay et al. (2010).
The domain of each function in the set on which a functional random variable is defined is a
continuum:
Definition 1.3 (Continuums)
1. A continuum is a compact connected metric space.
2. A continuum M is non-degenerate iff |M | > 1.
The standard examples of continuums are closed intervals [a, b] ⊂ R and their finite products. Most functional data analysis
involves data defined on I1 or I1 × . . . × In where each Ii is a closed interval of R.
One important characteristic of functional data is that they are uncountably infinite-dimensional;
it suffices to show any non-degenerate continuum is uncountably infinite to establish this characteristic.
Theorem 1.1 If M is a non-degenerate continuum then M is uncountable.
Proof Since |M| > 1 by Definition 1.3, there must exist two points x, y ∈ M such that x ≠ y.
Now define the function f : M → R such that f(z) = d(z, x) and define r = d(x, y). Since
x ≠ y, Definition 1.1(2) implies r > 0. Since (0, r) is uncountable it is sufficient to show that
f is a surjection onto (0, r). Pick r1 ∈ (0, r). Suppose, to get a contradiction, that there is
no point p ∈ M such that f(p) = r1. Then there exists a partition of M into the non-empty open sets
{m ∈ M : f(m) > r1} (which contains y) and {m ∈ M : f(m) < r1} (which contains x). This contradicts Definition 1.2(2), so some such p must
exist; f is therefore a surjection onto (0, r) and we conclude that M is uncountable.
Some further preliminary definitions are required to define random variables precisely.
Definition 1.4 (σ-algebras) Let Ω be a set and let F be a subset of 2^Ω. F is a σ-algebra on
Ω iff three conditions are satisfied: (i) ∅ ∈ F; (ii) if the set A ∈ F then Ac ∈ F; and (iii) if the
sets (Ai)i∈{1,2,...} ∈ F then ∪_{i=1}^{∞} Ai ∈ F.
Definition 1.5 (Probability spaces) (Ω, F, P) is a probability space iff
1. The sample space, Ω, is an arbitrary set;
2. F is a σ-algebra on Ω; and
3. The probability measure, P, is a function mapping F → [0, 1] satisfying:
(i) if A ∈ F then P(A) ≥ 0 (non-negativity);
(ii) P(Ω) = 1 (unitarity); and
(iii) if {A1, A2, . . .} is a disjoint collection of elements of F then P(∪_{i=1}^{∞} Ai) = ∑_{i=1}^{∞} P(Ai)
(countable additivity).
Definition 1.6 (Random variables) Let (Ω, F, P) be a probability space, let E be a set and
E, a σ-algebra on E. A random variable is a function, X : (Ω, F) → (E, E) such that if B ∈ E
then the pre-image of B: X −1 (B) = {ω ∈ Ω : X(ω) ∈ B} ∈ F.
The abbreviated notation X : Ω → E will frequently be used to denote X : (Ω, F) → (E, E).
Figure 1.2: Examples of realisations of functional random variables. The chart on the left plots
three realisations of a standard Wiener process; the chart on the right plots three realisations of
a standard Poisson process.
1.2.2 Defining functional random variables
Now we have sufficient background to formulate a precise definition of functional random variables:
Definition 1.7 (Functional random variables) A random variable X : Ω → E is a
functional random variable iff every χ ∈ E is a function χ : M → F for some non-degenerate
continuum M.
X is a real functional random variable iff every χ ∈ E is a function mapping I → R
where I = [a, b] for a, b ∈ R and a < b.
In practice, many problems in functional data analysis choose E ⊂ C n, where C n is the set
of all n-times continuously differentiable functions with domain I. Examples of realisations of functional
random variables are shown in Figure 1.2.
1.2.3 Defining functional datasets
A dataset generated by a functional random variable is a functional dataset; this is defined as a
straightforward extension of the univariate and multivariate dataset.
Definition 1.8 (Functional dataset) A set {X1 , X2 , . . . , Xn } is a functional dataset iff
for each i, j ∈ {1, . . . , n}, Xi ∼ X , for some functional random variable X ; and, Xi ⊥⊥ Xj if
i ≠ j.
An element of a functional dataset is called a functional datum. {χ1 , χ2 , . . . , χn } is
used to denote the non-random observation of a functional dataset.
So functional data analysis consists of making inferences and predictions about the functional
random variable X from the functional dataset {X1 , X2 , . . . , Xn }. The natural question that
arises at this point is how do we observe an uncountably infinite dimensional functional random
variable? The answer is that we do not; instead, we construct a functional datum from a finite
sample.
1.3 Constructing functional datasets
This thesis illustrates the construction of functional datasets only for the case of real functional
random variables; that is, functional random variables defined on a set of functions all with
domain I. The problem of constructing a functional dataset is usually solved using a two step
procedure. First, take a dense but finite sample of a single observation of a functional random
variable (in the context of the Berkeley Growth Study, this means we should identify a single
child and measure their height at frequent intervals). Second, using the dense sample reconstruct
the behaviour of the functional datum over its whole domain. The problem of data collection is
usually easily solved, so the focus of this section is the second step of this procedure. The data
we are able to observe is a finitely observed functional dataset:
Definition 1.9 (Finitely observed functional dataset)
The set {{X1 (t1,1 ), X1 (t1,2 ), . . . , X1 (t1,m1 )}, . . . , {Xn (tn,1 ), Xn (tn,2 ), . . . Xn (tn,mn )}} is a
finitely observed functional dataset iff {X1 , . . . , Xn } is a functional dataset generated by a
real functional random variable; and for each i ∈ {1, . . . , n}, j ∈ {1, . . . , mi } we have ti,j ∈ I
and Xi (ti,j ) is the value of Xi at ti,j .
Now the task is to construct the functional dataset using only information contained in the
finitely observed functional dataset. To have any hope of performing this construction we must
make assumptions about the behaviour of the functional random variable. To see why, consider
white Gaussian noise, {X (t)}t∈[0,1] , and suppose we observe a single realisation, {χ(t)}t∈[0,1] , at
only a finite number of points {χ(t1 ), χ(t2 ), . . . , χ(tn )} (see Figure 1.3). Now by construction,
X(t) ⊥⊥ X(s) if s ≠ t, so information about the value of χ at {t1, t2, . . . , tn} provides no information at all about its behaviour at other times. It is, therefore, impossible to use the sample to
conclude anything about the values of χ at unobserved times, and impossible to reconstruct the
infinite dimensional functional datum χ.
Figure 1.3: White Gaussian noise. A functional datum, χ is shown in blue; the finite sequence
of points observed {χ(t1 ), χ(t2 ), . . . , χ(tn )} is shown in red.
1.3.1 Assumptions
Since the data analysed in statistics is almost always generated by natural phenomena it is
prudent to make assumptions that are usually satisfied by natural phenomena. In functional data
analysis, the standard approach is to make assumptions about the continuity and smoothness
of the functional random variable. Common choices are that a functional random variable is
continuous or that it is continuously differentiable:
Definition 1.10 Let X : Ω → E be a functional random variable.
1. X is continuous iff E ⊂ C 0 .
2. X is continuously differentiable iff E ⊂ C 1 .
The capacity for change of many natural systems is limited by the availability of energy
capable of effecting that change. Since the supply of such energy is usually gradual, many
systems can be accurately modelled with continuously differentiable functions. Even financial
markets, which are often modelled by non-differentiable stochastic processes, are controlled by the
waxing and waning of supply and demand, so some aspects of financial markets can be analysed
with continuously differentiable functional random variables; see, for instance, Cai (2011) and
Aneiros et al. (2011).
Alternatively, if we wish to construct a functional dataset not well approximated by continuously differentiable functions, we can adopt a different approach; Bugni et al. (2009) analysed a
discontinuous, non-differentiable functional random variable by assuming instead
that the data were piecewise constant and all points of discontinuity were observed. In other
situations, if the functional data are sampled very densely and there is very little noise in the
data, accurate reconstruction at small time intervals may not be especially important and we
may analyse the data after connecting the points with lines of constant gradient. This is the
technique that will be used for the analysis of the phenomenon of stock price reversal. We will
briefly review two of the most popular methods for constructing functional datasets: (1) basis
functions; and (2) kernel smoothing. A complete description is provided by Ramsay & Silverman
(2005).
1.3.2 Basis smoothing
The idea motivating this approach is that many functions can be approximated arbitrarily closely
by a linear combination of a set of suitably chosen basis functions: {φ1 , φ2 , . . .}. The basic choice
for {φ1 , φ2 , . . .} is the monomial basis:
{1, t, t2 , . . .}
If the data show evidence of periodicity a better choice may be the Fourier basis:
{1, sin(ωt), cos(ωt), sin(2ωt), cos(2ωt), . . .}
More exotic bases are occasionally used, including B-spline, I-spline and wavelet bases. Sometimes simple bases suffice, including the constant basis or the step function basis. See
Ramsay & Silverman (2005) for further information.
Once a suitable basis has been chosen the goal is to estimate the infinite vector {c1, c2, . . .},
using information contained in the finitely observed functional dataset, so that
∑_{j=1}^{∞} ĉ_j φ_j(t)
is an accurate approximation of the functional datum. In practice, we do not ordinarily estimate
an infinite vector; instead we truncate the basis after a finite number of terms, K, and then
estimate {c1, c2, . . . , cK} so that the sum
χ̂(t) = ∑_{j=1}^{K} ĉ_j φ_j(t)
is an accurate approximation of the functional datum. Various methods have been developed for
choosing K; the most convincing approach is to balance the bias of estimates caused by a small
K against the greater variance of estimates caused by a large K.
A common method for estimating {c1, c2, . . . , cK} is to use techniques developed for constructing least squares estimates to minimise:
SMSSE({χ(t1), χ(t2), . . . , χ(tn)} | c) = ∑_{i=1}^{n} ( χ(ti) − ∑_{j=1}^{K} c_j φ_j(ti) )²
Ordinary least squares estimation assumes that the residuals at the observation points ti are independent
and identically distributed. Often this is an unrealistic assumption because measurement error may
depend on the precise value of t at which the observation is made or the level of the function
observed; approximation errors may also exhibit autocorrelation. To account for these complications, weights can be added to each term to allow for unequal variances and covariances of
residuals.
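To make the fitting step concrete, the following is a minimal sketch, in Python, of an unweighted least-squares fit of a truncated monomial basis to a finitely observed functional datum; the sampling times, observed values and the truncation point K are hypothetical, and in practice K would be chosen by the bias–variance compromise discussed above.

    import numpy as np

    # Hypothetical finitely observed functional datum: values chi_obs at times t.
    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 1.0, 25)
    chi_obs = np.sin(2 * np.pi * t) + 0.1 * rng.normal(size=t.size)

    K = 6                                    # number of basis functions retained
    Phi = np.vander(t, K, increasing=True)   # monomial basis 1, t, t^2, ..., t^(K-1) at the t_i

    # Unweighted least squares estimate of c_1, ..., c_K, i.e. the minimiser of SMSSE.
    c_hat, *_ = np.linalg.lstsq(Phi, chi_obs, rcond=None)

    def chi_hat(s):
        """Reconstructed functional datum evaluated at the times s."""
        return np.vander(np.atleast_1d(s), K, increasing=True) @ c_hat

    print(chi_hat(np.array([0.0, 0.5, 1.0])))

Weighted or generalised least squares would replace the plain lstsq call when the residuals are believed to be heteroscedastic or correlated.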
1.3.3 Kernel smoothing
Kernel smoothing is closely related to the use of basis functions; it is best applied when it is
reasonable to assume that the functional data are continuously differentiable. The tacit assumption in
fitting continuously differentiable functions to finitely observed functional datasets is that the values
of a function at a point are close to the values of the function at nearby points. Kernel smoothing
makes this assumption explicit. This sacrifices some of the generality of basis smoothing but
more accurate estimates are often obtainable with smaller datasets.
The estimated functional datum χ̂ is expressed as a linear combination of all observed values:
χ̂(t) = ∑_{j=1}^{n} w_j(t) χ(t_j)
The weight, wj , of each observed χ(tj ) is expressed in terms of a kernel function. To assist in the
development of theoretical properties, kernel functions are assumed to be symmetric, continuous
probability density functions. Common choices are the uniform kernel, the Epanechnikov kernel
and the Gaussian kernel, respectively:
K(u) = 1/2 if |u| ≤ 1, and 0 elsewhere;
K(u) = (3/4)(1 − u²) if |u| ≤ 1, and 0 elsewhere;
K(u) = (1/√(2π)) exp(−u²/2).
Figure 1.4: Phase and amplitude variation. The chart on the left shows functional data differing
only in phase and the chart on the right shows functional data differing only in amplitude.
Then a common choice for the weight function is the Nadaraya–Watson weight, after Nadaraya
(1964) and Watson (1964):
w_j(t) = K((t − t_j)/h) / ∑_{i=1}^{n} K((t − t_i)/h)
where h is called the bandwidth of the kernel. A large bandwidth means more observations are
practically incorporated in the imputation of each point and the result is a smoother curve; a
small bandwidth means fewer values are incorporated and the result is a curve that is not as
smooth.
In general the bandwidth is selected by a similar compromise to the selection of the degree at
which the basis is truncated in Section 1.3.2 above: a small bandwidth gives an estimator that
is nearly unbiased but with high variance; a large bandwidth gives a biased estimator with low
variance. So h is chosen to balance the variance and bias of the estimator.
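By way of illustration, here is a minimal sketch of a Nadaraya–Watson smoother with a Gaussian kernel; the simulated observations and the bandwidth h are hypothetical, and in practice h would be chosen by the bias–variance compromise just described (for example by cross-validation).

    import numpy as np

    def gaussian_kernel(u):
        # The Gaussian kernel: a symmetric, continuous probability density.
        return np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)

    def nadaraya_watson(t_eval, t_obs, chi_obs, h):
        """Kernel estimate of the functional datum at t_eval from observations (t_obs, chi_obs)."""
        K = gaussian_kernel((t_eval[:, None] - t_obs[None, :]) / h)   # K((t - t_j) / h)
        weights = K / K.sum(axis=1, keepdims=True)                    # Nadaraya-Watson weights w_j(t)
        return weights @ chi_obs

    # Hypothetical finitely observed functional datum.
    rng = np.random.default_rng(1)
    t_obs = np.sort(rng.uniform(0.0, 1.0, size=40))
    chi_obs = np.cos(2 * np.pi * t_obs) + 0.1 * rng.normal(size=t_obs.size)

    t_grid = np.linspace(0.0, 1.0, 101)
    chi_hat = nadaraya_watson(t_grid, t_obs, chi_obs, h=0.05)   # small h: rougher, nearly unbiased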
1.4 Registration
Once a functional dataset has been constructed the next step is to inspect the dataset for phase
and amplitude variation, see Figure 1.4. Often phase variation is not of primary interest and
many techniques in functional data analysis are designed only to account for amplitude variation;
so phase variation is usually removed using a procedure called curve registration. Phase variation
may arise because biological systems experience ‘local’ times that pass at slightly different rates
to the ‘clock’ time. For example a plot of the derivatives of the growth of children in the Berkeley
Growth study, shown in Figure 1.5, suggests that some children experience growth spurts earlier
than others and for some the growth spurt is longer; this means that computing statistics of the
untransformed data can give results that are not representative. Only a brief description of the
procedure is provided here; a thorough account is available in Ramsay (2011).
There are essentially three kinds of registration: (1) shift registration; (2) landmark registration; and (3) continuous registration. Shift registration is the simplest: each curve is shifted
Figure 1.5: Landmark Registration. This chart shows the first derivative of the heights of the
children plotted in Figure 1.1. It is adapted from the work of Ramsay et al. (2010).
to the left or right and no warping is required to ensure that key features align — this is the
procedure that would be applied to the synthetic data shown in the first panel of Figure 1.4.
Landmark registration requires both shifting and warping the data so that key features coincide
across all curves. This technique could be used for the data shown in Figure 1.5: the data could
be shifted and warped so that zero-crossings of all curves coincide. Continuous registration is
the most advanced approach; it uses the entire curve to align the data rather than simply the
location of landmarks — it is described in detail in Ramsay (2011).
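To give a flavour of the simplest of the three procedures, the following is a rough sketch of shift registration for curves sampled on a common grid: each curve is shifted by the amount that minimises its squared distance to a template, here taken to be the cross-sectional mean. The grid of candidate shifts, the use of the mean as the template and the interpolation at the boundary are assumptions of this sketch, not part of the procedures described by Ramsay (2011).

    import numpy as np

    def shift_register(curves, t, shifts):
        """Shift each row of `curves` (sampled at times t) to best match the mean curve."""
        template = curves.mean(axis=0)
        registered = np.empty_like(curves)
        for i, chi in enumerate(curves):
            # Evaluate the curve at t + s for each candidate shift s and keep the best fit.
            errors = [np.mean((np.interp(t + s, t, chi) - template) ** 2) for s in shifts]
            best = shifts[int(np.argmin(errors))]
            registered[i] = np.interp(t + best, t, chi)
        return registered

    # Hypothetical curves differing only in phase, as in the left panel of Figure 1.4.
    t = np.linspace(0.0, 1.0, 200)
    rng = np.random.default_rng(2)
    curves = np.array([np.sin(2 * np.pi * (t - d)) for d in rng.uniform(-0.1, 0.1, size=5)])
    aligned = shift_register(curves, t, shifts=np.linspace(-0.15, 0.15, 61))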
1.5 A sample dataset
This thesis will illustrate the application and efficacy of the new techniques defined by examining
the phenomenon of stock market overreaction. The precise hypothesis investigated is whether
very large single-day price declines in stock prices are more likely to be followed by a period of
sustained price increase as compared to stock price behaviour after all other trading days. Early
progress on this topic was made by De Bondt & Thaler (1985) and Bremer & Sweeney (1991);
recent progress has been made by Cressy & Farag (2011) and Mazouz et al. (2009). In particular,
Cressy & Farag (2011) suggest the phenomenon may be exploited to construct profitable trading
algorithms.
The reversal of sudden stock price decreases was amongst the first substantial evidence that
stock markets are not Markovian; that is, that Markovian models cannot capture all aspects of the
behaviour of stock prices. Since popular models for stock price evolution, including those of Black &
Scholes (1973) and Heston (1993), assume the Markov property, this finding suggests
such models should only be applied cautiously.
Functional data analysis has frequently found application in econometrics, see for example,
Cai (2011), Aneiros et al. (2011) and Bugni et al. (2009). So the application presented here may
be seen as a continuation of that work.
To examine the phenomenon of stock market overreaction, I collected historical prices from
1982 to 2011 for all companies currently listed on the NASDAQ Stock Market. I then assigned
the data to two categories: (1) the 40 day price evolution of a stock immediately following a
single-day decline of 10% or more; and (2) the 40 day price evolution of a stock after trading
days where there was no decline of 10% or more. A subsample of these two categories is
plotted in Figure 1.6. The goal now is to determine if there are statistically significant differences
between the two categories and to provide descriptions of any such differences.
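As an illustration of how the two categories might be assembled from a single stock's daily closing prices, here is a minimal sketch; the 10% threshold and the 40 day window follow the description above, while the simulated price series and the normalisation of each window by the closing price on the day of the decline are assumptions of the sketch.

    import numpy as np

    def categorise_windows(close, threshold=-0.10, horizon=40):
        """Split 40 day windows of a daily closing-price series into the two categories.

        Category 1: windows immediately following a single-day return <= threshold.
        Category 2: windows following all other trading days.
        Each window is normalised by the closing price on the day the window starts.
        """
        returns = close[1:] / close[:-1] - 1.0
        decline, other = [], []
        for i, r in enumerate(returns):
            start = i + 1                                 # the day on which the return is realised
            if start + horizon >= close.size:
                break                                     # not enough subsequent history
            window = close[start:start + horizon + 1] / close[start]
            (decline if r <= threshold else other).append(window)
        return np.array(decline), np.array(other)

    # Hypothetical price history: a geometric random walk with heavy-tailed daily returns.
    rng = np.random.default_rng(3)
    daily = np.clip(0.03 * rng.standard_t(3, size=2500), -0.5, 0.5)
    close = 20.0 * np.cumprod(1.0 + daily)
    after_decline, after_other = categorise_windows(close)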
There were some problems with the data; the most significant were: stocks that were no
longer trading as of 2011 were systematically excluded; and, the price history was only of daily
resolution. The analysis of this data should therefore only be seen as illustrative of the quantiles
developed in this thesis. A more precise analysis of stock price reversal is available in Cressy &
Farag (2011) or Mazouz et al. (2009), although they do not approach the problem through the
prism of functional data analysis.
Based on Figure 1.6, we can draw some preliminary conclusions about the data. First, the
dispersion of stock prices is much higher after declines of 10% or more. Second, both datasets
seem to have a positive skew, with the skew being greater after the 10% decline. These findings
are in line with the prior work of Cressy & Farag (2011) and Mazouz et al. (2009).
Figure 1.6: The chart on the left shows a random selection of the 40 day trading history of
NASDAQ stocks after trading days where there was no single-day decline of 10% or more, 1982–
2011; right shows a random selection of 40 day trading history immediately following a single-day
decline of 10% or more.
Chapter 2
Multivariate quantiles
The goal of this chapter is to identify the difficulties of constructing multivariate quantiles and
describe methods by which some of these difficulties have been conquered. All definitions of
multivariate quantiles of which I am aware sacrifice some of the utility of univariate quantiles.
2.1 The failure of the ordinary approach
In the univariate case, the quantiles of a random variable are defined by reference to its distribution
function:
Definition 2.1 (Univariate quantiles) If X : Ω → R is a real random variable with distribution function FX and α ∈ (0, 1) then the α-quantile of X is:
Qα = inf {x ∈ R : FX (x) ≥ α}
Where necessary the distribution of the random variable associated with each quantile will be
included as a superscript: Q^X_α.
So we could try to extend this definition to a bivariate distribution as follows:
Definition 2.2 (Potential bivariate quantiles) If X : Ω → R2 is a bivariate real random
variable with distribution function FX (x, y) and α ∈ (0, 1) then the α-quantile of X may potentially be defined as:
Qα = inf {(x, y) ∈ R2 : FX (x, y) ≥ α}
However, for this definition to be meaningful we must be able to interpret the infimum of subsets
of R2 ; that is for some set S ⊂ R2 there must be some unique element (x, y) ∈ R2 so that (x, y)
is the largest element of R2 that is also less than or equal to every element of S. The existence
of a unique infimum requires the existence of a total order:
Definition 2.3 (Total orders) The set X is totally ordered under the relation ≤ iff for every
a1 , a2 , a3 ∈ X the following conditions are satisfied:
1. If a1 ≤ a2 and a2 ≤ a1 then a1 = a2 (antisymmetry);
2. If a1 ≤ a2 and a2 ≤ a3 then a1 ≤ a3 (transitivity); and
3. Either a1 ≤ a2 or a2 ≤ a1 (totality).
The natural choice for ordering R2 is the Euclidean norm:
(x1, y1) ≤ (x2, y2) iff √(x1² + y1²) ≤ √(x2² + y2²)
Unfortunately this is not a total order: (1, 2) ≤ (2, 1) and (2, 1) ≤ (1, 2) but (1, 2) ≠ (2, 1), which
contradicts the antisymmetry property of Definition 2.3. It is possible to specify total orders for
R2 , for instance the lexicographical order:
(x1 , y1 ) ≤ (x2 , y2 ) iff x1 ≤ x2 or (x1 = x2 and y1 ≤ y2 )
But such orders are not satisfactory in practice because they do not capture intuitive notions of
size, for example: (2, 0) ≥ (1, 1000). So, in general, we cannot extend the standard definition
of univariate quantiles to multivariate distributions in a way that permits the specification of
unique quantiles and corresponding order statistics.
2.2 Assessing the quality of multivariate quantiles
Despite the difficulty of ordering the values of multivariate distributions, substantial progress has
been made in defining multivariate quantiles. Before examining some of the definitions proposed,
it is appropriate to identify properties that good definitions of multivariate quantiles should possess.
Serfling (2002) suggests there are five such properties. Ideally, multivariate quantiles should:
1. permit probabilistic interpretations of empirical and distribution quantiles;
2. permit the description of location through measures analogous to univariate medians,
trimmed means and order statistics;
3. permit the description of dispersion through measures analogous to univariate interquartile
range;
4. display affine equivariance and related equivariance properties;
5. permit the construction and interpretation of test statistics for natural hypotheses on key
characteristics of datasets.
These properties will be used to judge the quality of both the definitions of multivariate quantiles
reviewed and the functional quantiles defined in the second part of this thesis.
2.3 Methods for constructing multivariate quantiles
Two methods for constructing multivariate quantiles are reviewed: (1) quantiles based on minimising norms; and (2) the quantiles of Fraiman & Pateiro-López (2011) which are tailored for use
in functional data analysis. A more complete review of the variety of methods for constructing
multivariate quantiles is presented in Serfling (2002).
2.3.1 Norm minimisation
In this section we review the definition of quantiles proposed by Chaudhuri (1996). The method
proposed by Chaudhuri (1996) did not generalise Definition 2.1 but instead extended a separate
but equivalent definition of univariate quantiles based on norm minimisation:
Lemma 2.1 (Univariate quantiles alternative definition) If X is a real random variable
with quantiles {Qα }α∈(0,1) then for each α ∈ (0, 1) we must have:
Qα = arg min_{x∈R} E[ |X − x| + (2α − 1)(X − x) ]
Proof The proof consists of replacing |X − x| with (X − x)1{X > x} + (x − X)1{X ≤ x} then
differentiating with respect to x and setting the resulting expression equal to 0.
This definition may be extended to multivariate distributions as follows:
Definition 2.4 (Chaudhuri’s multivariate quantiles) Suppose X : Ω → Rd is a multivariate random variable with d ≥ 2; then the β-quantile of X is:
Qβ = arg min_{x∈Rd} E[ Φ(β, X − x) − Φ(β, X) ]
where Φ(a, b) = ||b|| + < a, b > for the Euclidean norm || · || and the inner product < ·, · >, and
β, an element of the open unit ball of Rd, indexes the quantiles; β plays the role that 2α − 1 plays in Lemma 2.1.
Chaudhuri (1996) also suggests a method to estimate these quantiles and establishes the consistency of such estimators. This definition does not ensure the existence of a unique quantile for
each β, and Serfling (2002) shows that such quantiles only partially satisfy each of the
desirable properties of quantiles described in Section 2.2; but they perform quite well compared
to other definitions.
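To illustrate Definition 2.4, the sketch below estimates a β-quantile from a bivariate sample by replacing the expectation with a sample average and minimising numerically; the simulated sample, the choice of β and the use of a general-purpose optimiser are assumptions of the sketch rather than the estimation procedure analysed by Chaudhuri (1996).

    import numpy as np
    from scipy.optimize import minimize

    def empirical_beta_quantile(sample, beta):
        """Minimise the sample analogue of E[Phi(beta, X - x)] over x in R^d.

        Phi(beta, u) = ||u|| + <beta, u>; the term -Phi(beta, X) in Definition 2.4
        does not depend on x, so it can be dropped from the objective.
        """
        def objective(x):
            diff = sample - x
            return np.mean(np.linalg.norm(diff, axis=1) + diff @ beta)

        return minimize(objective, x0=sample.mean(axis=0), method="Nelder-Mead").x

    rng = np.random.default_rng(4)
    sample = rng.normal(size=(500, 2))          # hypothetical bivariate sample
    beta = np.array([0.5, 0.0])                 # a point in the open unit ball of R^2
    print(empirical_beta_quantile(sample, beta))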
2.3.2 Functional quantiles
In this section we review the definition of quantiles given in Fraiman & Pateiro-López (2011).
The definition proposed in Fraiman & Pateiro-López (2011) is very general: it is applicable to
all univariate, multivariate and functional random variables defined on Hilbert spaces. We only
consider the definition applied to a real functional random variable defined on a space of square
integrable functions with a specific inner product.
Definition 2.5 (Fraiman & Pateiro-López’s functional quantiles) Let X : Ω → E be a
real functional random variable such that E ⊂ L2, the space of square integrable functions.
Define the inner product for χ1, χ2 ∈ E:
< χ1, χ2 > = ∫_I χ1(t) χ2(t) dt
and the corresponding norm:
||χ1|| = √(< χ1, χ1 >)
and the unit ball:
B = {χ ∈ E : ||χ|| = 1}
Then the α-quantile in the direction of χ ∈ B is:
Q_{α,χ} = Q_α^{<X − E(X), χ>} + E(X)
where Q_α^{<X − E(X), χ>} is the univariate quantile of Definition 2.1 applied to the real random variable < X − E(X), χ >.
Fraiman & Pateiro-López (2011) suggest these quantiles display location equivariance, and
equivariance under unitary operators and homogeneous scale transformations. Fraiman & Pateiro-López (2011) also suggest a consistent method of estimating these quantiles. They do not comment on whether the quantiles permit probabilistic interpretations or the description of location
and dispersion of functional random variables; nor do they give methods for using these quantiles
to construct hypothesis tests.
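A rough empirical sketch of Definition 2.5, exactly as stated above, is given below for curves sampled on a common grid: the inner products are approximated by a Riemann sum, the α-quantile of the projections is the ordinary univariate sample quantile, and the mean function is added back at the end. The simulated curves and the direction χ are hypothetical, and this is not the estimator studied by Fraiman & Pateiro-López (2011).

    import numpy as np

    def directional_quantile(curves, t, chi, alpha):
        """Empirical version of Definition 2.5 for curves sampled on the uniform grid t."""
        dt = t[1] - t[0]
        chi = chi / np.sqrt(np.sum(chi ** 2) * dt)              # normalise so that ||chi|| = 1
        mean_curve = curves.mean(axis=0)
        # Projections <X_i - E(X), chi>, approximated by a Riemann sum on the grid.
        projections = np.sum((curves - mean_curve) * chi, axis=1) * dt
        return np.quantile(projections, alpha) + mean_curve     # a scalar plus the mean function

    rng = np.random.default_rng(5)
    t = np.linspace(0.0, 1.0, 101)
    curves = np.array([rng.normal() * np.sin(np.pi * t) + rng.normal() * t for _ in range(200)])
    chi = np.sin(np.pi * t)                                      # hypothetical direction
    q = directional_quantile(curves, t, chi, alpha=0.75)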
2.4 Room for improvement?
Fraiman & Pateiro-López (2011) present a convincing and theoretically sound method for defining
and estimating quantiles for functional random variables. However, so far, these quantiles have
not been shown to share many of the desirable characteristics outlined in Section 2.2. As their
construction is quite complex, it is not immediately clear if it is possible to establish these
characteristics; so an alternative construction of functional quantiles may succeed if it can be
used more easily to describe and make inferences on functional datasets.
Part II
Quantiles for Functional Data
Chapter 3
Defining functional quantiles
The goal of this chapter is to introduce a definition of quantiles for real functional random
variables. A description of the properties of the quantiles introduced is provided and a method
to estimate them empirically is suggested. The quantiles developed in this chapter should only
be applied after the functional dataset has been constructed as described in Section 1.3 and
registered as described in Section 1.4.
3.1 Functional quantiles
First, we propose a reasonable definition of the quantiles of the distribution of a functional
random variable. To overcome the fact that, in general, no natural order can be specified for an
arbitrary set of functions defined on a closed interval of the real line, we will reconstitute the
functions so that a natural order can be specified.
3.1.1 Defining functional quantiles
The univariate quantiles of Definition 2.1 can be extended to a real functional random variable,
{X (t)}t∈I , in a natural way by considering the marginal distribution at each t ∈ I. Since the
marginal distribution of X is a real random variable, Definition 2.1 can be applied directly:
Definition 3.1 (Functional quantiles) If X : Ω → E is a real functional random variable
with marginal distribution function FX(t) at t ∈ I, then the α-quantile of X is:
Qα(t) = inf {x ∈ R : FX(t)(x) ≥ α} for t ∈ I
Where necessary the distribution of the random variable associated with each quantile is
incorporated via a superscript: Q^X_α.
We can think of an α-quantile under this definition as a piecewise-defined function, where
the definition of the function is exactly some χ ∈ E for every t ∈ I. One immediate consequence
of this definition is that functional quantiles are, appropriately, functions mapping I → R.
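A minimal sketch of how Definition 3.1 might be evaluated empirically, for a functional dataset sampled on a common grid, is given below: at each grid point the left-continuous inverse of Definition 2.1 is computed directly by sorting. The simulated curves are hypothetical; the empirical quantiles studied later, in Section 3.2, are defined there, and this is only an illustration of the pointwise idea.

    import numpy as np

    def pointwise_quantile(curves, alpha):
        """Pointwise alpha-quantile of n curves sampled on a common grid (an n x m array).

        At each grid point, inf{x : F_n(x) >= alpha} is the ceil(alpha * n)-th smallest
        of the n observed values, where F_n is the empirical distribution function.
        """
        n = curves.shape[0]
        k = int(np.ceil(alpha * n)) - 1           # zero-based index of the order statistic
        return np.sort(curves, axis=0)[k]

    # Hypothetical functional dataset: 200 noisy sine curves on [0, 1].
    rng = np.random.default_rng(6)
    t = np.linspace(0.0, 1.0, 101)
    curves = np.sin(2 * np.pi * t) + 0.3 * rng.normal(size=(200, t.size))

    median_curve = pointwise_quantile(curves, 0.5)
    upper_quartile = pointwise_quantile(curves, 0.75)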
In Chapter 2 we saw that univariate quantiles are defined most easily in terms of the distribution function of a random variable. We should therefore consider whether there is an extension
of the distribution function for functional random variables as that might permit an alternative
construction of functional quantiles. Bugni et al. (2009) suggest one extension for real functional
random variables:
Definition 3.2 (Distribution functionals) If X : Ω → E is a real functional random variable then the distribution functional FX is given by:
FX (χ) = P [X (t) ≤ χ(t) for all t ∈ I]
(3.1)
Theorem 3.1 A distribution functional FX uniquely determines a functional random variable,
X.
Proof See the discussion in (Bugni et al. 2009, Section 2.1).
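By way of illustration, the value of a distribution functional at a fixed χ can be approximated by Monte Carlo whenever the functional random variable can be simulated on a grid; the sketch below does this for the standard Wiener process and the hypothetical function χ(t) = 1 + t, with the event in (3.1) checked only at the grid points.

    import numpy as np

    rng = np.random.default_rng(7)
    t = np.linspace(0.0, 1.0, 201)
    chi = 1.0 + t                                    # hypothetical fixed function chi

    # Simulate standard Wiener paths as cumulative sums of independent N(0, dt) increments.
    n_paths = 20000
    increments = rng.normal(scale=np.sqrt(t[1] - t[0]), size=(n_paths, t.size - 1))
    paths = np.hstack([np.zeros((n_paths, 1)), np.cumsum(increments, axis=1)])

    # Monte Carlo approximation of F_X(chi) = P[X(t) <= chi(t) for all t in I].
    F_chi = np.mean(np.all(paths <= chi, axis=1))
    print(F_chi)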
Now we could try to define the quantiles of X as some function of the set of functions
{χ ∈ E : FX(χ) ≥ α}. Unfortunately, as for Rn, no natural order can be specified for most
function spaces, so it is not sensible to talk about its infimum as we would for univariate quantiles.
Ultimately, after considering variations on this approach for some time, I rejected the idea that
quantiles could be defined using a distribution functional. It is not surprising that the standard
approaches for univariate and finite-dimensional multivariate data analysis do not carry over well
to the functional case; other authors have made similar findings. For instance, Delaigle & Hall
(2010) show, amongst other things, that there is no direct analogue of the probability density
function defined in terms of small-ball probabilities for functional random variables.
Often it is quite easy to find the quantiles of a functional random variable. By way of example,
the derivation of the quantiles of the standard Wiener process is provided in Theorem 3.2; a plot of
the quantiles derived and several synthetically generated simulations of a Wiener process is shown
in Figure 3.1.
Theorem 3.2 The standard Wiener process has quantiles Qα(t) = √t Φ^{-1}_{0,1}(α), where Φ_{μ,σ²} denotes the distribution function of N(μ, σ²), the normal distribution with mean μ and variance σ².
Proof Let W be a Wiener process. By definition, W(t) ∼ N(0, t). So, applying Definition
2.1, the α-quantile of W(t) is Φ^{-1}_{0,t}(α) = √t Φ^{-1}_{0,1}(α).
Figure 3.1: 20 realisations of a standard Wiener process with selected quantiles; the quartiles are shown
in red and Q^W_{0.05}(t) and Q^W_{0.95}(t) (which estimate the minimum and maximum of the sample) are
shown in green.
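As a numerical check on Theorem 3.2, the sketch below simulates standard Wiener paths on a grid and compares their pointwise sample quantiles with √t Φ^{-1}_{0,1}(α); the number of paths, the grid and the seed are arbitrary.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(8)
    t = np.linspace(0.0, 1.0, 201)
    n_paths = 5000

    # Standard Wiener paths: cumulative sums of independent N(0, dt) increments, started at 0.
    increments = rng.normal(scale=np.sqrt(t[1] - t[0]), size=(n_paths, t.size - 1))
    paths = np.hstack([np.zeros((n_paths, 1)), np.cumsum(increments, axis=1)])

    alpha = 0.95
    theoretical = np.sqrt(t) * norm.ppf(alpha)         # the quantile curve of Theorem 3.2
    empirical = np.quantile(paths, alpha, axis=0)      # pointwise sample quantile
    print(np.max(np.abs(theoretical - empirical)))     # small when n_paths is large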
3.1.2 Properties of functional quantiles
The quantiles introduced in Section 3.1.1 mimic the behaviour of the functional random variable
to which they are applied reasonably closely: if a functional random variable is continuous then
so are its quantiles; if a functional random variable is differentiable then in many situations
its quantiles will be too. This section establishes several useful properties of quantiles. The
central result of this section is that quantiles do not, in general, characterise the distribution of
a functional random variable as shown in Theorem 3.4.
Theorem 3.3 If X is a continuous functional random variable then the quantiles of X are
continuous.
Proof Let X : Ω → E be a continuous functional random variable with quantiles {Q^X_α}_{α∈(0,1)}.
Then by Definition 1.10, E ⊂ C 0. Now it suffices to show that for any ε > 0 and c ∈ I there
exists some δ > 0 such that if |t − c| < δ then |Q^X_α(t) − Q^X_α(c)| < ε. Pick ε > 0, c ∈ I and
α ∈ (0, 1). Now define F = {χγ}_{γ∈G} so that if γ ∈ G then χγ ∈ E and χγ(c) = Q^X_α(c). Now if
{χγ}_{γ∈G} has only one element then, by Definition 3.1, in a neighbourhood of c we have Q^X_α = χγ, which
is, by definition of X, a continuous function, so there must exist δ > 0 such that if |t − c| < δ
then |Q^X_α(t) − Q^X_α(c)| < ε.
Now suppose {χγ}_{γ∈G} has more than one element. Because X is continuous and Q^X_α(t) = χ(t)
for some χ ∈ E, by Definition 3.1, there must exist some χ1, χ2 ∈ {χγ}_{γ∈G} and some
δ1, δ2 > 0 for which Q^X_α(t) = χ1(t) if t ∈ (c − δ1, c) and Q^X_α(t) = χ2(t) if t ∈ (c, c + δ2).
Now since χ1, χ2 ∈ E, χ1 and χ2 are continuous functions, and since χ1(c) = χ2(c), if we ensure
δ ≤ min{δ1, δ2} then we can find δ such that if |t − c| < δ then |Q^X_α(t) − Q^X_α(c)| < ε.
Theorem 3.4 Quantiles do not characterise the distribution of a functional random variable.
Proof It suffices to show there exist two functional random variables, X and Y, such that
Q^X_α(t) = Q^Y_α(t) if α ∈ (0, 1) and X ≁ Y. Let χ1 and χ2 be non-random functions mapping
I → R such that, for some t1 ∈ I:
χ1(t1) = χ2(t1);  χ1(t) < χ2(t) if t < t1;  χ1(t) > χ2(t) if t > t1.
Define X:
X(t) = χ1(t) with probability 0.5, and χ2(t) with probability 0.5.
Then define Y:
Y(t) = max{χ1(t), χ2(t)} with probability 0.5, and min{χ1(t), χ2(t)} with probability 0.5.
Now P(X = χ1) = 0.5 ≠ 0 = P(Y = χ1), so X ≁ Y; but applying Definition 3.1 we find the
quantiles of X and Y are:
Q^X_α(t) = Q^Y_α(t) = max{χ1(t), χ2(t)} if α > 0.5, and min{χ1(t), χ2(t)} if α ≤ 0.5.
A concrete example of a pair of distinct functional random variables which share quantiles is shown in
Figure 3.2.
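The counterexample can also be checked numerically. The sketch below builds the two random variables of the proof from a hypothetical pair of functions crossing once at t1 = 1.5, draws a large sample from each, and confirms that the pointwise sample quantiles agree even though the two laws clearly differ.

    import numpy as np

    t = np.linspace(0.0, 3.0, 301)
    chi1, chi2 = t - 1.5, 1.5 - t                 # chi1 < chi2 before t1 = 1.5 and chi1 > chi2 after

    rng = np.random.default_rng(9)
    coin = rng.integers(0, 2, size=100000).astype(bool)

    # X picks chi1 or chi2 with probability 0.5 each; Y picks the pointwise max or min.
    sample_X = np.where(coin[:, None], chi1, chi2)
    sample_Y = np.where(coin[:, None], np.maximum(chi1, chi2), np.minimum(chi1, chi2))

    for alpha in (0.25, 0.75):
        qx = np.quantile(sample_X, alpha, axis=0)
        qy = np.quantile(sample_Y, alpha, axis=0)
        print(alpha, np.max(np.abs(qx - qy)))     # essentially zero: the quantiles coincide

    # The laws differ: X equals the whole curve chi1 half the time, Y never does.
    print(np.mean(np.all(sample_X == chi1, axis=1)), np.mean(np.all(sample_Y == chi1, axis=1)))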
Contemplation of this result, and in particular of Figure 3.2, suggests that the
information lost when a functional random variable is reduced to its quantiles is knowledge about
which branch it takes at intersections of realisations of the functional random variable.
3.1.3 Characterising a functional random variable
Characterisation is a property of almost all definitions of quantiles. Univariate quantiles determine exactly the distribution function of a random variable and therefore characterise the
distribution. Similarly the multivariate quantiles of Chaudhuri (1996) and the functional quantiles of Fraiman & Pateiro-López (2011) characterise the distribution of the random variables to
which they are applicable. Our goal in this section is to prove that quantiles characterise the
distribution of a functional random variable if certain conditions are satisfied.
Standard assumptions in functional data analysis are continuity, differentiability and monotonicity. However, we saw in Figure 3.2 that continuity is not a sufficient condition, and we can
see in Figures 3.3 and 3.4 that monotonicity and differentiability are not sufficient either.
Figure 3.2: An example of a pair of functional random variables which reduce to the same
quantiles. The left and middle charts show the equiprobable values of the functional random
variables; the right chart shows the quantiles which are shared by both random variables.
Figure 3.3: The figure shows that monotonicity is not sufficient to ensure that a functional
random variable is characterised by its quantiles. It follows the same schema as Figure 3.2.
Figure 3.4: This figure shows that differentiability is not sufficient to ensure that a functional
random variable is characterised by its quantiles. It follows the same schema as Figure 3.2.
So what are sufficient conditions to ensure that quantiles characterise the distribution of a
functional random variable? We observed in the previous section that the information lost when
a functional random variable is reduced to its quantiles is knowledge about which path is taken
at branch points of different realisations; so if we make assumptions about the behaviour of the
functional random variable at all branch points we should be able to show that quantiles are
characteristic. First some definitions are required:
Definition 3.3 (Stacked, crossing and tangled) Let X : Ω → E be a real functional
random variable. And let χ1 , χ2 be arbitrary elements of E; so χ1 and χ2 are functions
mapping I → R.
1. X is stacked iff for every χ1 , χ2 ∈ E, either χ1 ≥ χ2 or χ1 ≤ χ2 . Where χ1 ≤ (≥)χ2
iff for every t ∈ I, χ1 (t) ≤ (≥)χ2 (t).
2. X is crossing iff for every χ1 , χ2 ∈ E and t ∈ I such that χ1 (t) = χ2 (t) there is some
δ > 0 such that:
(i) χ1 (x) ≤ (≥)χ2 (x) if x ∈ (t − δ, t);
(ii) There is at least one x ∈ (t − δ, t) such that χ1 (x) < (>)χ2 (x);
(iii) χ1 (x) ≥ (≤)χ2 (x) if x ∈ (t, t + δ); and
(iv) There is at least one x ∈ (t, t + δ) such that χ1 (x) > (<)χ2 (x)
3. X is tangled iff it is neither stacked nor crossing.
To provide intuition about this definition, examples of selected stacked, crossing and tangled
functional random variables are plotted in Figures 3.5 and 3.6. The central idea is that all the
values of a stacked functional random variable can be placed in a stack without having to push
one function through another. Crossing is a complementary concept: if two values of a crossing
functional random variable are equal at a point, then they must push through each other at the
point. Note that a functional random variable, X : Ω → E, does not need to be continuous for
either property to be satisfied, nor does E need to be finite; functional random variables can also
be both stacked and crossing.
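For finitely many curves sampled on a common grid, the stacked property of Definition 3.3 can be checked directly: the sketch below tests, for every pair of curves, whether one lies entirely on one side of the other (up to the resolution of the grid). The example curves are hypothetical.

    import numpy as np
    from itertools import combinations

    def is_stacked(curves):
        """Return True iff, for every pair of curves, one dominates the other pointwise."""
        for chi_i, chi_j in combinations(curves, 2):
            if not (np.all(chi_i <= chi_j) or np.all(chi_i >= chi_j)):
                return False          # this pair is ordered neither way, so the set is not stacked
        return True

    t = np.linspace(0.0, 1.0, 101)
    stacked_set = np.array([t + c for c in (0.0, 0.5, 1.0)])   # vertically shifted copies: stacked
    crossing_set = np.array([t, 1.0 - t])                      # these curves cross at t = 0.5
    print(is_stacked(stacked_set), is_stacked(crossing_set))   # True False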
First we consider a consequence of the definition of stacked functional random variables.
This result is used as a stepping stone to show that all stacked functional random variables are
characterised by their quantiles.
Figure 3.5: The chart on the left plots every χ ∈ E for a stacked functional random variable
X : Ω → E; the chart on the right gives an example of a tangled functional random variable
that is almost stacked; the points marked by arrows are the points at which the stacked property
fails.
Figure 3.6: The figure follows a similar schema to that of Figure 3.5. Left shows every χ ∈ E
for a crossing functional random variable and right shows a tangled functional random variable
that is almost crossing.
Lemma 3.5 Let X : Ω → E be a stacked real functional random variable with quantiles
{Qα}α∈(0,1). Then E = {Qα}α∈(0,1).
Proof It is enough to show (i) if α ∈ (0, 1) then Qα = χ for some χ ∈ E; and (ii) if χ ∈ E then
χ = Qα for some α ∈ (0, 1).
First we establish (i): pick α ∈ (0, 1). By Definition 3.1, Qα(t) = inf{x ∈ R : FX(t)(x) ≥ α}.
We need to show there is some χ ∈ E such that χ(t) = Qα(t) for every t ∈ I. Suppose, to get a
contradiction, there is no such χ ∈ E. Then there must exist χ1, χ2 ∈ E such that χ1(t) ≠ χ2(t)
if t ∈ {t1, t2} ⊂ I, Qα(t1) = χ1(t1) and Qα(t2) = χ2(t2). Assume without loss of generality
that χ1(t1) > χ2(t1). Then to ensure χ2(t2) = Qα(t2) and χ1(t2) ≠ χ2(t2) we must have
χ1(t2) < χ2(t2). This contradicts the assumption that X is stacked, so our supposition
that there was no χ ∈ E such that χ(t) = Qα(t) for every t ∈ I was false.
Now we consider part (ii): we need to show that if χ ∈ E then there is some α ∈ (0, 1) such
that Qα = χ. Pick χ ∈ E. Let aγ = {α ∈ (0, 1) : Qα(tγ) = χ(tγ)}. Then define a = ∩_{γ∈I} aγ.
Then we claim a ≠ ∅ and Qα = χ if α ∈ a.
We now have sufficient knowledge to prove the first central theorem of this thesis: that all
stacked functional random variables are characterised by their quantiles.
Theorem 3.6 (Quantile characterisation theorem – stacked) If (Ω, F, P) is a probability space and X : (Ω, F) → (E, E) is a stacked real functional random variable, then it
is uniquely determined by its quantiles: {Qα }α∈(0,1) .
Proof (nonconstructive) First we give a nonconstructive proof. Let X and Y be stacked
functional random variables sharing the quantiles {Qα}α∈(0,1). It suffices to show that
X ∼ Y.
Suppose, to get a contradiction, that X ≁ Y. Then there must exist some set F ∈ E
such that P(X ∈ F) ≠ P(Y ∈ F). Let Ft1 = {χ(t1)}χ∈F for some t1 ∈ I. Now since
X and Y are stacked, to ensure P(X ∈ F) ≠ P(Y ∈ F) there must exist t1 such that
P(X(t1) ∈ Ft1) ≠ P(Y(t1) ∈ Ft1). But Definition 3.1 ensures the quantiles of a functional
random variable determine its marginal distributions exactly; so the existence of such a t1
contradicts the assumption that X and Y share quantiles.
Proof (constructive) Next we give a constructive proof. This proof requires the additional assumption that functional random variables are continuous.
Let X : Ω → E be a stacked continuous functional random variable with quantiles
{Qα }α∈(0,1) . It is sufficient to show (i) we can recover the set E and (ii) we can measure
any F ∈ E with respect to P using only information contained in {Qα }α∈(0,1) .
First, by Lemma 3.5, {Qα }α∈(0,1) is identically E, so (i) is satisfied.
Now consider (ii). Pick F ∈ E. Since X is continuous, every χ1 , χ2 ∈ F can be distinguished by their values on the countable dense set IQ = {q ∈ Q : q ∈ I}. So P(X ∈ F ) may
be computed inductively from the marginal distributions of X at each t ∈ IQ . To make this
explicit, pick t0 ∈ IQ and consider the set Et0 = {χ(t0 )}χ∈E . A point y ∈ Et0 must satisfy
one and only one of the following conditions:
1. there is some χ ∈ F such that χ(t0 ) = y and there is some χ ∈ F c such that χ(t0 ) = y;
2. there is some χ ∈ F such that χ(t0 ) = y and there is no χ ∈ F c such that χ(t0 ) = y;
and
3. there is no χ ∈ F such that χ(t0 ) = y and there is some χ ∈ F c such that χ(t0 ) = y.
Let y_1^0, y_2^0 and y_3^0 be the sets of all y ∈ Et0 satisfying the first, second and third conditions
respectively. Now {χ ∈ E : χ(t0) ∈ y_2^0} ⊂ F by definition of y_2^0 and {χ ∈ E : χ(t0) ∈ y_3^0} ⊂ F c
by definition of y_3^0, so let P0(F ) = P(χ(t0) ∈ y_2^0), let E1 = E\{χ ∈ E : χ(t0) ∈ y_2^0 ∪ y_3^0}
and let IQ1 = IQ\{t0}.
Now using induction, assume Pn(F ), En and IQn are known, then pick tn+1 ∈ IQn so we
can define sets corresponding to y_1^0, y_2^0 and y_3^0:

y_1^{n+1} = {y ∈ {χ(tn+1)}χ∈En : y ∈ {χ(tn+1)}χ∈F ∩ {χ(tn+1)}χ∈F c}
y_2^{n+1} = {y ∈ {χ(tn+1)}χ∈En : y ∈ {χ(tn+1)}χ∈F ∩ ({χ(tn+1)}χ∈F c)c}
y_3^{n+1} = {y ∈ {χ(tn+1)}χ∈En : y ∈ ({χ(tn+1)}χ∈F)c ∩ {χ(tn+1)}χ∈F c}

So {χ ∈ E : χ(tn+1) ∈ y_2^{n+1}} ⊂ F and {χ ∈ E : χ(tn+1) ∈ y_3^{n+1}} ⊂ F c. Now define
Pn+1(F ) = P(χ(tn+1) ∈ y_2^{n+1}), let En+1 = En\{χ ∈ En : χ(tn+1) ∈ y_2^{n+1} ∪ y_3^{n+1}} and let
IQn+1 = IQn\{tn+1}.
So we can compute the sequence {P0(F ), P1(F ), . . .}. Now the sum of this sequence is
exactly:

P(X ∈ F ) = ∑_{i=0}^{∞} Pi(F )
Now we turn to the case of crossing functional random variables. Before a theorem can be
stated we need one more technical definition and two more lemmas.
Definition 3.4 (Smoothing quantiles) A real functional random variable X : Ω → E with
quantiles {Qα}α∈(0,1) does not have smoothing quantiles iff for every α ∈ (0, 1) and every t1 ∈ I
such that Qα is differentiable at t1, there is some χ ∈ E and δ > 0 such that Qα(t) = χ(t) if
t ∈ (t1 − δ, t1 + δ).
The functional random variables shown in Figure 3.4 have smoothing quantiles; all other examples
of functional random variables depicted in this thesis do not.
Lemma 3.7 If a real functional random variable X : Ω → E is continuously differentiable and
there are no functions χ1 , χ2 ∈ E, and t ∈ I such that χ1 (t) = χ2 (t) and (∂χ1 )(t) = (∂χ2 )(t)
then X is crossing.
Proof Assume X : Ω → E is a continuously differentiable functional random variable and there
are no functions χ1 , χ2 ∈ E, and t ∈ I such that χ1 (t) = χ2 (t) and (∂χ1 )(t) = (∂χ2 )(t). Pick
χ1 , χ2 ∈ E and t ∈ I such that χ1 (t) = χ2 (t). It suffices to show that there is some δ > 0
satisfying the four limbs of Definition 3.3(2).
Now by assumption (∂χ1)(t) ≠ (∂χ2)(t), so assume without loss of generality that (∂χ1)(t) >
(∂χ2)(t). Now since χ1 and χ2 are differentiable, χ1(t) = χ2(t) and (∂χ1)(t) > (∂χ2)(t), there
must exist δ > 0 such that χ1(x) ≤ χ2(x) if x ∈ (t − δ, t) and χ1(x) ≥ χ2(x) if x ∈ (t, t + δ),
so (i) and (ii) are satisfied. Now since χ1(t) = χ2(t) and (∂χ1)(t) > (∂χ2)(t) there must exist
x ∈ (t − δ, t) such that χ1(x) < χ2(x) and x ∈ (t, t + δ) such that χ1(x) > χ2(x), so (iii) and (iv)
are satisfied and the result follows.
Lemma 3.8 If X : Ω → E is a continuously differentiable functional random variable with
quantiles {Qα }α∈(0,1) , then for every α ∈ (0, 1), the function Qα is differentiable except for at
most countably many points.
Proof We will see that disjoint open intervals can be placed around every point of nondifferentiability. This suffices because every open interval contains a rational number and any
subset of the rationals is at most countable.
Let t1 and t2 be members of the set I such that t1 < t2 and Qα is non-differentiable at t1
and t2 . Now it suffices to show that for every such t1 , t2 we can find t such that t1 < t < t2 as
this permits the specification of the disjoint open intervals (·, t) and (t, ·) containing t1 and t2 .
Suppose, to get a contradiction, that there is no t such that t1 < t < t2 . Now by the assumption
that X is continuously differentiable, to ensure non-differentiability of Qα at t1 and t2 , there
must exist χ1 , χ2 ∈ E such that
χ1(t1) = Qα(t1)
χ1(t2) ≠ Qα(t2)
χ2(t2) = Qα(t2)
χ2(t1) ≠ Qα(t1)
But since Qα is continuous (as a consequence of Theorem 3.3) this implies that χ1 and χ2
are discontinuous; but since X is continuous, χ1 and χ2 must be continuous and we have our
contradiction. So there must exist t such that t1 < t < t2 . Similar results to this lemma are
obtained in Goswami & Rao (2004) and Froda (1929).
Now we have sufficient knowledge to prove the second central theorem of this thesis: that
some crossing functional random variables are characterised by their quantiles.
Theorem 3.9 (Quantile characterisation theorem – crossing) If (Ω, F, P) is a probability space and X : (Ω, F) → (E, E) is a real functional random variable such that:
1. X is continuously differentiable;
2. There are no χ1 , χ2 ∈ E and t ∈ I such that χ1 (t) = χ2 (t) and (∂χ1 )(t) = (∂χ2 )(t);
and
3. X does not have smoothing quantiles.
Then X is uniquely determined by its quantiles {Qα }α∈(0,1) . Note that conditions 1 and 2
imply X is crossing as a consequence of Lemma 3.7.
Proof It suffices to show that: (i) we can construct every χ ∈ E and; (ii) we can measure
any F ∈ E with respect to P. The proof of (ii) is identical to constructive proof given for
Theorem 3.6; so we will only prove (i).
Pick χ ∈ E. Let t0 = min(I). Now by Definition 3.1, χ(t0) = Qα(t0) for some α ∈ (0, 1).
Let such an α be α0. Now let t1 = sup{t ∈ I : Qα0 is continuously differentiable on [t0, t)}. Then by the
assumption that X does not have smoothing quantiles, χ(t) = Qα0(t) if t ∈ [t0, t1]. Now
choose α1 ∈ (0, 1) so that the function

Qα0,α1(t) = Qα0(t) if t ∈ [t0, t1);   Qα0,α1(t) = Qα1(t) if t ≥ t1

is differentiable if t ∈ (t0, t1]. Such an α1 must exist because X is continuously differentiable.
Define a1 as the set of all α1 satisfying this condition.
There is no guarantee that {Qα}α∈a1 contains only one function. However, for any α1 ∈ a1
there is some χ2 ∈ E such that Qα0,α1(t) = χ2(t) for t ∈ [t0, t1], so we are simply
constructing multiple elements of E simultaneously; so this is no barrier to the reconstruction
of E.
Now we apply a similar procedure to choose t2 and a2 and define:

Qα0,α1,α2(t) = Qα0(t) if t ∈ [t0, t1);   Qα1(t) if t ∈ [t1, t2);   Qα2(t) if t ≥ t2
Then {χ(t)}t∈[t0 ,t2 ] ∈ {Qα0 ,α1 ,α2 (t)}α1 ∈a1 ,α2 ∈a2 ,t∈[t0 ,t2 ] again by the assumption that X
does not have smoothing quantiles. By Lemma 3.8, Qα has only countably many points
of non-differentiability for any α ∈ (0, 1), so we can apply this procedure inductively to
reconstruct χ over its entire domain.
3.2 Empirical functional quantiles
The goal of this section is to describe how the functional quantiles of Definition 3.1 can be
estimated empirically. A natural definition of empirical quantiles is given in Section 3.2.1; and
bias and consistency results for these empirical quantiles are obtained in Section 3.2.2.
3.2.1 Defining empirical quantiles
As before, we turn to the univariate case to find inspiration to define functional empirical quantiles. Unfortunately this is not entirely straightforward because there is a multiplicity of definitions for empirical univariate quantiles; in fact, the R Development Core Team (2011) suggest
there are at least nine ways to estimate univariate quantiles empirically. We will examine only
one simple method. First define the empirical distribution function:
Definition 3.5 Let {X1 , . . . , Xn } be a sample generated by the real random variable X. Then
the empirical distribution function of X is:
F̂X(t) = (1/n) ∑_{i=1}^{n} 1{Xi ≤ t}
The empirical distribution function permits an elegant definition of empirical quantiles:
Definition 3.6 Let {X1 , X2 , . . . , Xn } be a sample generated by the real random variable X and
let F̂X (x) be the empirical distribution function of X based on {X1 , . . . , Xn }. Then the empirical
α-quantile of X is:
Q̂α = inf {x ∈ R : F̂X (x) ≥ α}
So the univariate empirical quantile can be constructed simply by replacing FX with F̂X in
Definition 2.1. A natural extension for functional quantiles is obtained by applying an identical
substitution to Definition 3.1:
Definition 3.7 (Empirical functional quantiles) Let X be a real functional random
variable and let {X1 , X2 , . . . , Xn } be a functional dataset generated by X . Let F̂X (t) be the
marginal empirical distribution function at t ∈ I based on {X1 (t), X2 (t), . . . , Xn (t)}.
Then the empirical α-quantile of the sample {X1, X2, . . . , Xn} is:

Q̂α(t) = inf {x ∈ R : F̂X(t)(x) ≥ α} for t ∈ I

Figure 3.7: The figure on the left shows a sample of 3 stocks after a large price decline and the figure on the right superimposes the empirical functional median, Q̂0.5(t), of Definition 3.7.
This definition is easier to work with if it is expressed slightly differently (although the
similarity of Definitions 3.1 and 3.7 is concealed):
Lemma 3.10 If {X1, X2, . . . , Xn} is a functional dataset generated by a functional random variable with quantiles {Qα}α∈(0,1) and {X(1)(t), X(2)(t), . . . , X(n)(t)} are the order statistics
of {X1(t), X2(t), . . . , Xn(t)}, then Definition 3.7 is equivalent to:

Q̂α(t) = X(⌈nα⌉)(t) for t ∈ I
Proof This is an immediate consequence of Definition 3.5 and Definition 3.7.
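To make Lemma 3.10 concrete, the following is a minimal sketch of the pointwise computation, assuming the functional data have been recorded on a common grid of points in I. It is written in Python with NumPy, which is not the software used to produce the figures in this thesis, and the function name and the synthetic example are purely illustrative.

    import numpy as np

    def empirical_functional_quantile(curves, alpha):
        # curves: array of shape (n, m) holding n functional data observed on a
        # common grid of m points in I; alpha: a level in (0, 1).
        # Following Lemma 3.10, Q_hat_alpha(t) is the order statistic
        # X_(ceil(n * alpha))(t) computed separately at each grid point.
        n = curves.shape[0]
        r = int(np.ceil(n * alpha))          # the rank ceil(n * alpha), 1-based
        ordered = np.sort(curves, axis=0)    # order statistics at each grid point
        return ordered[r - 1, :]             # shift to 0-based indexing

    # Example: the empirical functional median of five noisy curves.
    rng = np.random.default_rng(0)
    grid = np.linspace(0.0, 1.0, 41)
    sample = np.sin(2 * np.pi * grid) + 0.1 * rng.standard_normal((5, grid.size))
    median_curve = empirical_functional_quantile(sample, 0.5)

Definition 3.7 itself only requires the marginal empirical distribution functions, so the same curve could equally be produced by inverting F̂X(t) at each grid point; the order-statistic form of Lemma 3.10 is simply the more direct computation.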
An example of the application of this definition to a small functional dataset is shown in
Figure 3.7. In Figure 3.8 the definition is applied to the full NASDAQ stock price dataset
introduced in Section 1.5.
3.2.2 Properties of empirical quantiles
This section develops bias and consistency results for the empirical quantiles defined in Definition
3.7. We will see that if certain weak conditions are satisfied then {Q̂α }α∈(0,1) are consistent
estimates for the functional quantiles of Definition 3.1 although they are usually biased estimates
for finite sample sizes. The proofs presented in this section are a relatively straightforward
extension of results obtained for univariate quantiles and order statistics. David (1970) was the
reference consulted for the univariate results. Some continuity and differentiability properties for functional quantiles are also established.

Figure 3.8: Empirical functional quantiles. This figure plots the quartiles of the NASDAQ stock price data calculated according to Definition 3.7. This plot may be compared with Figure 1.6.
To compute bias results exactly, we must make assumptions about the marginal distribution
of the functional random variable. We consider a functional random variable with marginal
uniform distribution; these results will assist in establishing general asymptotic results.
Lemma 3.11 (Marginal uniform distribution bias) If {X1 , X2 , . . . , Xn } is a functional dataset
generated by a functional random variable with marginal uniform distribution (that is X (t) ∼
U[0, 1] for t ∈ I) then {Q̂α }α∈(0,1) are biased estimates of {Qα }α∈(0,1) .
Proof It suffices to show E(Q̂α(t)) ≠ Qα(t). So pick α ∈ (0, 1) and t ∈ I. First define the
function f⌈nα⌉ as the density function of X(⌈nα⌉)(t), and compute:

E(Q̂α(t)) = E(X(⌈nα⌉)(t))
          = ∫_{−∞}^{∞} x f⌈nα⌉(x) dx
          = n (n−1 choose ⌈nα⌉−1) ∫_{−∞}^{∞} x [FX(t)(x)]^{⌈nα⌉−1} [1 − FX(t)(x)]^{n−⌈nα⌉} fX(t)(x) dx
          = n (n−1 choose ⌈nα⌉−1) ∫_{0}^{1} x · x^{⌈nα⌉−1} (1 − x)^{n−⌈nα⌉} dx
          = n (n−1 choose ⌈nα⌉−1) / [(n + 1) (n choose ⌈nα⌉)]
          = ⌈nα⌉/(n + 1),

which is not, in general, equal to α = Qα(t); for example, whenever nα is an integer, E(Q̂α(t)) = nα/(n + 1) < α.
The first two equalities follow by definition; the third uses the following result: the probability
that an order statistic X(r)(t) falls within the small ball (x − dx, x + dx) is exactly equal to the
probability that one of the n observations of X(t) falls within that small ball while exactly r − 1 of
the remaining observations are below x and the rest are above x.
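The closed form just obtained is easy to check numerically. The following is a small Monte Carlo sketch for a single point t (Python with NumPy, not part of the thesis); with uniform marginals the sample at t is simply n independent U[0, 1] draws, and the constants chosen are arbitrary.

    import numpy as np

    # Monte Carlo check of Lemma 3.11 at a single t: with U[0, 1] marginals the
    # empirical alpha-quantile is the order statistic U_(ceil(n * alpha)), whose
    # exact expectation is ceil(n * alpha) / (n + 1), not alpha.
    rng = np.random.default_rng(1)
    n, alpha, reps = 20, 0.7, 200_000
    r = int(np.ceil(n * alpha))                    # rank of the order statistic

    samples = rng.uniform(size=(reps, n))
    q_hat = np.sort(samples, axis=1)[:, r - 1]     # U_(r) in each replication

    print(q_hat.mean(), r / (n + 1))               # both close to 14/21, not 0.7
    print(q_hat.var())                             # of order 1/n, as the next lemma shows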
Lemma 3.12 (Marginal uniform distribution consistency) If {X1, X2, . . . , Xn} is a functional dataset generated by a functional random variable with marginal uniform distribution then
{Q̂α}α∈(0,1) are 1/n-consistent estimators of {Qα}α∈(0,1).
Proof Pick α ∈ (0, 1) and t ∈ I. Now it suffices to show E(Q̂α(t)) → Qα(t) as n → ∞ and
var(Q̂α(t)) = O(1/n). The first result is a consequence of Lemma 3.11 since ⌈nα⌉/(n + 1) → α and
Qα(t) = α by the assumption that X has marginal uniform distribution. To establish the second
we need to find the second moment of the empirical quantile:
E(Q̂α(t)^2) = E(X(⌈nα⌉)(t)^2)
           = ∫_{−∞}^{∞} x^2 f⌈nα⌉(x) dx
           = n (n−1 choose ⌈nα⌉−1) ∫_{0}^{1} x^2 · x^{⌈nα⌉−1} (1 − x)^{n−⌈nα⌉} dx
           = ⌈nα⌉(1 + ⌈nα⌉)/((2 + n)(1 + n))
Now the variance is:

var(Q̂α(t)) = E(Q̂α(t)^2) − [E(Q̂α(t))]^2
           = ⌈nα⌉(1 + ⌈nα⌉)/((2 + n)(1 + n)) − [⌈nα⌉/(n + 1)]^2
           = (1 + n − ⌈nα⌉)⌈nα⌉/((1 + n)^2(2 + n))
           ≤ (1 + n)n/((1 + n)^2(2 + n))
           = O(1/n)

So the empirical quantiles are 1/n-consistent for the functional quantiles when X has a marginal uniform distribution.
Now we can establish asymptotic results for general functional random variables.
Theorem 3.13 (Asymptotic Bias) Let {X1 , X2 , . . . , Xn } be a functional dataset generated by
a real functional random variable. Assume the marginal distribution of X at t ∈ I is continuous
and Qα (t) can be approximated arbitrarily closely by its Taylor series expansion in α. Then Q̂α
is an asymptotically unbiased estimator of Qα .
Proof We need to show that for each α ∈ (0, 1) we have E(Q̂α) → Qα as n → ∞. Pick t ∈ I
and α ∈ (0, 1). Now Q̂α(t) = X(⌈nα⌉)(t) by Lemma 3.10, and X(⌈nα⌉)(t) can be approximated
by computing the Taylor series expansion of a suitable transformation of a functional random
variable with a marginal uniform distribution. Denote the distribution function of X(t) as FX(t).
Since X(t) is continuous there must exist some function FX(t)^{-1} such that (FX(t)^{-1} ◦ FX(t))(x) = x.
Note that FX(t)^{-1}(α) is none other than Qα(t) by Definition 3.1.
Now assume Y is a real functional random variable with marginal uniform distribution; then
FX(t)(X(⌈nα⌉)(t)) ∼ Y(⌈nα⌉)(t), so we must have:

X(⌈nα⌉)(t) ∼ FX(t)^{-1}(FX(t)(X(⌈nα⌉)(t))) ∼ FX(t)^{-1}(Y(⌈nα⌉)(t)).

Then we construct a Taylor series expansion about E(Y(⌈nα⌉)(t)) = ⌈nα⌉/(n + 1) (from Lemma 3.11):

X(⌈nα⌉)(t) ∼ FX(t)^{-1}(⌈nα⌉/(n + 1)) + (Y(⌈nα⌉)(t) − ⌈nα⌉/(n + 1)) (FX(t)^{-1})′(⌈nα⌉/(n + 1)) + · · ·

Now if we compute the expectation of both sides:

E(Q̂α(t)) = E(X(⌈nα⌉)(t))
          = E(FX(t)^{-1}(⌈nα⌉/(n + 1))) + O(1/n)
          = FX(t)^{-1}(⌈nα⌉/(n + 1)) + O(1/n)

Now since FX(t)^{-1}(⌈nα⌉/(n + 1)) → FX(t)^{-1}(α) = Qα(t) as n → ∞, the proof is complete.
Theorem 3.14 (Consistency) Let {X1 , X2 , . . . , Xn } be a functional dataset generated by a real
functional random variable. Assume the marginal distribution of X at t ∈ I is continuous and
Qα(t) can be approximated arbitrarily closely by its Taylor series in α. Then {Q̂α}α∈(0,1) are
1/n-consistent estimators for {Qα}α∈(0,1).
Proof Pick α ∈ (0, 1) and t ∈ I. In Theorem 3.13 we established that E(Q̂α ) → Qα as n → ∞
so it suffices to show var(Q̂α(t)) = O(1/n).
Applying the same argument as we used in the proof of Theorem 3.13 we compute E(Q̂α(t)^2):

E(Q̂α(t)^2) = E(X(⌈nα⌉)(t)^2)
           = E([FX(t)^{-1}(⌈nα⌉/(n + 1)) + (Y(⌈nα⌉)(t) − ⌈nα⌉/(n + 1)) (FX(t)^{-1})′(⌈nα⌉/(n + 1)) + · · ·]^2)
           = E([FX(t)^{-1}(⌈nα⌉/(n + 1))]^2) + O(1/n)
           = [FX(t)^{-1}(⌈nα⌉/(n + 1))]^2 + O(1/n)
So var(Q̂α(t)) is:

var(Q̂α(t)) = E(Q̂α(t)^2) − [E(Q̂α(t))]^2
           = [FX(t)^{-1}(⌈nα⌉/(n + 1))]^2 − [FX(t)^{-1}(⌈nα⌉/(n + 1))]^2 + O(1/n)
           = O(1/n)

So Q̂α(t) is 1/n-consistent for Qα(t).
Theorem 3.15 Let {X1 , X2 , . . . , Xn } be a functional dataset generated by a real continuously
differentiable functional random variable. Then every Q̂α ∈ {Q̂α }α∈(0,1) is continuous at every
t ∈ I and differentiable except for at most countably many points.
Proof This result is established by adapting the proofs of Theorem 3.3 and Lemma 3.8.
3.2.3 Missing data
Often a functional datum may be unobserved only over some of its domain; indeed, Bugni
(2010) suggests this is a frequent problem in functional data analysis. Fortunately, if the data
are missing completely at random the empirical functional quantiles of Definition 3.7 can be
extended relatively easily. First a precise description of the nature of missing data addressed is
required:
Definition 3.8 (Partially observed functional datasets) A partially observed functional dataset
is a functional dataset {X1 , X2 , . . . , Xn } such that for each i ∈ {1, . . . , n} the functional datum
Xi is observed only over the set ti ⊂ I and there is at least one i ∈ {1, . . . , n} such that ti ( I;
and each ti is drawn completely at random from I.
Now we can adapt Definition 3.7 to the functional datasets of the previous definition.
Definition 3.9 (Empirical functional quantiles – Missing Data) Let X be a real
functional random variable and let {X1 , X2 , . . . , Xn } be a partially observed functional dataset
generated by X . Now denote the empirical distribution function F̂X (t) (x) as:
F̂X(t)(x) = (1/|{i : t ∈ ti}|) ∑_{i : t ∈ ti} 1{Xi(t) ≤ x}
Then the empirical α−quantile of the sample {X1 , X2 , . . . , Xn } is:
Q̂α (t) = inf {x ∈ R : F̂X (t) (x) ≥ α} for t ∈ I
Estimators defined in this way are also asymptotically unbiased and 1/m-consistent under
appropriate conditions, where m = min_{t∈I} |{i ∈ {1, . . . , n} : t ∈ ti}|; that is, m is equal to
the least number of curves observed at any single t ∈ I. However, this adjustment does come at a cost:
Theorem 3.15 is no longer true, as discontinuities may arise when we exit or enter the domain
over which a functional datum is observed.
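A sketch of Definition 3.9 follows, again in Python with NumPy (not the software used in the thesis), assuming missing values are encoded as NaN on a common grid and that at least one curve is observed at every grid point; the function name is illustrative.

    import numpy as np

    def empirical_quantile_missing(curves, alpha):
        # curves: array of shape (n, m) with np.nan wherever a curve is unobserved.
        # At each grid point only the curves observed there contribute to the
        # marginal empirical distribution function, as in Definition 3.9; at
        # least one observation per grid point is assumed.
        n, m = curves.shape
        out = np.empty(m)
        for j in range(m):
            observed = np.sort(curves[~np.isnan(curves[:, j]), j])
            r = int(np.ceil(observed.size * alpha))   # rank among observed values
            out[j] = observed[r - 1]
        return out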
Chapter 4
Applications
The goal of this chapter is to demonstrate how the quantiles defined in the previous chapter can
be used in practice. Three applications are suggested: describing functional datasets (Section
4.1); detecting outliers (Section 4.2); and constructing functional qq-plots (Section 4.3). Each of
these applications is a natural extension of methods developed for use with univariate quantiles.
These applications are also evidence of the quality of the quantiles defined according to the
criteria proposed by Serfling (2002) as outlined in Section 2.2.
4.1 Describing functional datasets
One of the goals of statistics is to provide tools for describing datasets efficiently. In the univariate
case, four characteristics are often measured: location, dispersion, skewness and kurtosis. We
will now see that the functional quantiles introduced in Definition 3.1 can be used to measure
each of location, dispersion and skewness. In principle, it is also possible to measure kurtosis,
but there does not seem to be a single, widely accepted quantile-based measure of kurtosis
(see for example, Groeneveld 1998 and Moors 1988) so it is not immediately clear which of the
multiplicity of measures should be adapted for use with functional quantiles.
In this section, the functional random variables shown in Figure 4.1 are used as a template
to illustrate the methods for describing functional random variables.
4.1.1 Location
In univariate data analysis, the median is a standard measure of the location of a dataset or
distribution; so, it is reasonable to attempt to measure the location of a functional dataset or
distribution using the functional median or the functional empirical median. In Figure 4.2 we
plot the medians of the functional random variables shown in Figure 4.1. Some meaningful
conclusions can be drawn from this measure: the green distribution is stationary, while the red
and blue distributions display linear growth.
Figure 4.1: Data generated by Wiener processes Wr,σ2(t). From left to right: r = 1 and σ2 = 1; r = 0 and σ2 = 2; and r = −2 and σ2 = 3.
Figure 4.2: Functional medians. The figure on the left shows the medians of the distributions
plotted in Figure 4.1. The figure on the right superimposes the medians over functional data
generated by the corresponding functional random variables.
Figure 4.3: Functional inter-quartile range. This chart plots the functional inter-quartile range
of the distributions shown in Figure 4.1 according to Definition 4.1.
4.1.2 Dispersion
A natural candidate to measure the dispersion of a distribution is the range or the inter-quartile
range. In general, inter-quartile range is preferred in univariate data analysis because: it contains
more information about random variables; it is more robust; and its expected value is asymptotically independent of sample size. Since it is likely that these characteristics carry over to the
functional case, inter-quartile range is a better candidate for generalisation.
Definition 4.1 (Functional inter-quartile range) If X is a functional random variable
with quantiles {Qα }α∈(0,1) then its interquartile range is:
Q0.75 (t) − Q0.25 (t) for t ∈ I
The empirical inter-quartile range is obtained simply by replacing Q with Q̂.
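As a minimal sketch (Python with NumPy, assuming curves observed on a common grid; the function name is illustrative rather than part of the thesis), the empirical version of Definition 4.1 is a short combination of the pointwise order statistics used earlier:

    import numpy as np

    def functional_iqr(curves):
        # Empirical functional inter-quartile range of Definition 4.1:
        # Q_hat_0.75(t) - Q_hat_0.25(t), evaluated pointwise on the grid.
        n = curves.shape[0]
        ordered = np.sort(curves, axis=0)
        q75 = ordered[int(np.ceil(0.75 * n)) - 1, :]
        q25 = ordered[int(np.ceil(0.25 * n)) - 1, :]
        return q75 - q25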
The application of this definition to the functional random variables of Figure 4.1 is shown
in Figure 4.3. For the functional random variables plotted, we conclude that each displays
heteroskedasticity, with dispersion that grows with t (in proportion to √t for a Wiener process). The empirical functional
interquartile range of a functional dataset may be used, in principle, to construct a variety of
formal tests.
4.1.3 Skewness
Skewness can be measured in an analogous way to the measure of dispersion just presented. A
natural extension of a quantile-based definition of skewness apparently due to Bowley (1926) is:
Figure 4.4: Describing functional datasets. The left chart shows the medians of the NASDAQ
datasets described in Section 1.5, the middle chart shows interquartile range and the right chart
shows skewness. The blue line shows the behaviour of stocks after a decline of 10% or more; the
green line shows the behaviour of stocks under ordinary conditions.
Definition 4.2 (Functional skewness) If X is a functional random variable with quantiles {Qα }α∈(0,1) then its skewness is:
[Q0.75(t) + Q0.25(t) − 2Q0.5(t)] / [Q0.75(t) − Q0.25(t)] for t ∈ I
The empirical skewness is obtained simply by replacing Q with Q̂.
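A corresponding sketch of the empirical version of Definition 4.2, under the same assumptions as the earlier snippets (Python with NumPy, curves on a common grid, illustrative names):

    import numpy as np

    def functional_skewness(curves):
        # Empirical functional skewness of Definition 4.2: the Bowley coefficient
        # (Q_0.75 + Q_0.25 - 2 Q_0.5) / (Q_0.75 - Q_0.25), applied pointwise.
        n = curves.shape[0]
        ordered = np.sort(curves, axis=0)

        def q(a):
            return ordered[int(np.ceil(a * n)) - 1, :]

        return (q(0.75) + q(0.25) - 2.0 * q(0.5)) / (q(0.75) - q(0.25))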
Applying this definition to the functional random variables of Figure 4.1 shows that all three
have constant, zero skewness. This is the expected result because each functional random variable
is constructed from a symmetric marginal distribution, the normal distribution.
The result of applying the equivalent empirical measures of location, dispersion and skewness
to the NASDAQ datasets is shown in Figure 4.4.
4.1.4 Alternatives
Several methods for describing and depicting functional data already exist in the literature.
Perhaps most significant is the work of: Ramsay & Silverman (2005), which describes methods
for estimating empirical means, variances and covariances; Ferraty & Vieu (2006), which identifies
an empirical median; Delaigle & Hall (2010), which identifies an empirical mode; and Hall &
Heckman (2002), which suggests a procedure for depicting the structure of the distribution of a
functional random variable. The method described here is complementary to these methods.
4.2 Detecting outliers
The goal of this section is to describe a supervised classification algorithm for identifying outliers
in functional datasets.
4.2.1 The classification process
The first step of the process is to identify candidate outliers. We begin by extending the notion
of upper and lower fences; these are most commonly encountered in univariate boxplots, where
values below the lower fence and above the upper fence are classified as outliers and displayed
accordingly.
Definition 4.3 (Upper and lower fences) Let {Q̂α}α∈(0,1) be the empirical quantiles calculated from the functional dataset {X1, X2, . . . , Xn}. Then the lower fence is:

Q̂0.25(t) − (3/2)[Q̂0.75(t) − Q̂0.25(t)] for t ∈ I

and the upper fence is:

Q̂0.75(t) + (3/2)[Q̂0.75(t) − Q̂0.25(t)] for t ∈ I
This definition may be contrasted with the two dimensional fence defined for bivariate data
in Rousseeuw et al. (1999).
Now we describe the criterion used to select candidate outliers:
Definition 4.4 (Candidate outliers) Let {χ1, χ2, . . . , χn} be a functional dataset with quantiles {Q̂α}α∈(0,1). If χi ∈ {χ1, χ2, . . . , χn} and there is some t ∈ I such that either

χi(t) < Q̂0.25(t) − (3/2)[Q̂0.75(t) − Q̂0.25(t)]

or

χi(t) > Q̂0.75(t) + (3/2)[Q̂0.75(t) − Q̂0.25(t)]

then χi is a candidate outlier.
The candidates identified should then be reviewed to determine if they are genuinely outliers.
This review is not simply a formality; indeed, it is possible to construct functional datasets such
that every observation is a candidate outlier; and it would be unwise to suggest that all such
candidates are genuinely outliers.
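The candidate-selection step can be sketched as follows (Python with NumPy, curves on a common grid; names illustrative). The manual review described above is deliberately left outside the code: the function only reports which curves breach a fence somewhere on I.

    import numpy as np

    def candidate_outliers(curves):
        # Definition 4.4: a curve is a candidate outlier if, at some grid point,
        # it falls below the lower fence or above the upper fence of Definition 4.3.
        n = curves.shape[0]
        ordered = np.sort(curves, axis=0)
        q25 = ordered[int(np.ceil(0.25 * n)) - 1, :]
        q75 = ordered[int(np.ceil(0.75 * n)) - 1, :]
        lower = q25 - 1.5 * (q75 - q25)
        upper = q75 + 1.5 * (q75 - q25)
        breach = (curves < lower) | (curves > upper)   # pointwise fence breaches
        return np.where(breach.any(axis=1))[0]         # indices of candidate curves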
4.2.2 An example
The NASDAQ dataset is so large that it is not practical to apply a supervised classification algorithm;
so instead the procedure is illustrated using a dataset described in Ramsay & Silverman (2005).
Figure 4.5: Lower lip movement. The chart on the left plots a subject pronouncing the phoneme
‘b’ 20 times. The chart on the right superimposes the upper and lower functional quartiles of
Definition 3.7 in red and the upper and lower fences of Definition 4.3 in green.
Figure 4.6: Outliers in lower lip movement data. The chart on the right plots the four outliers
identified, while the chart on the left plots the remaining observations.
The first panel of Figure 4.5 plots twenty observations of the position of a subject’s lower lip with
respect to time while pronouncing the phoneme ‘b’. The data are displayed only after registration.
A substantially similar dataset is used by Gervini (2008) to demonstrate a different algorithm
for detecting outliers in functional data. The second panel of Figure 4.5 superimposes the upper
and lower empirical quartiles of Definition 3.7 and the upper and lower fence of Definition 4.3.
Applying Definition 4.4 to identify any candidate outliers yields five observations. A review
of these candidates showed that one breached the upper fence for only a very brief period, so this
candidate was rejected; the remaining candidates were accepted as outliers. Figure 4.6 shows the
outliers identified in the first panel and the remaining data in the second. A comparison with
Figure 4.5 suggests that the data with outliers removed are substantially more homogeneous than
the initial dataset.
4.2.3 Alternatives
There are only a small number of methods for detecting outliers in the functional data analysis
literature: see, for example, Hyndman & Shang (2008) and Gervini (2008). The method of
Hyndman & Shang (2008) is quite similar to the method described here.
4.3 QQ-plots
To the best of my knowledge, this section provides the first description of qq-plots for functional
data. One reason for this lacuna is that previous definitions of quantiles, including those of
Fraiman & Pateiro-López (2011) and Chaudhuri (1996), do not give a unique quantile for each
α ∈ (0, 1), so no directly analogous plot can be constructed. Nonetheless, for finite-dimensional
multivariate distributions Easton & McCulloch (1990) and Liang & Ng (2009) provide some
compelling methods for constructing similar plots.
4.3.1 Univariate qq-plots
First we recall the construction of univariate qq-plots. There are two kinds:
Definition 4.5 (One sample univariate qq-plots) If {X1, X2, . . . , Xn} is a dataset generated by a real random variable X with order statistics {X(1), X(2), . . . , X(n)} and Y is a real
random variable with quantiles {Q^Y_α}α∈(0,1) then the qq-plot of the dataset {X1, X2, . . . , Xn}
compared to Y is the plot of the points:

{(X(i), Q^Y_{(i−0.5)/n}) : i ∈ {1, 2, . . . , n}}
In Section 3.2.1 we observed that there may be subtle differences in the method used to define
empirical quantiles. These differences carry over into the construction of qq-plots, but they do
not pose any serious difficulty so they are not discussed here.
Definition 4.6 (Two sample univariate qq-plots) If {X1 , X2 , . . . , Xn } and {Y1 , Y2 , . . . Yn }
are datasets generated by real random variables X and Y and {X(1) , X(2) , . . . , X(n) } and {Y(1) , Y(2) , . . . Y(n) }
are their respective order statistics then the two-sample qq-plot of these datasets is the plot of the
points {(X(i) , Y(i) ) : i ∈ {1, 2, . . . , n}}.
If the samples are of different sizes then adjustments must be made to the preceding definition;
these are addressed in Wilk & Gnanadesikan (1968) and since there do not appear to be conceptual difficulties in extending these technical adjustments to the functional case, no discussion is
provided.
Once constructed, qq-plots need to be interpreted. The interpretation is an application of
the following theorem and corollary:
Theorem 4.1 (Univariate qq-plots) If X and Y are real random variables such that Y ∼
aX + b for constants a and b then the points plotted in a qq-plot approach the line y = ax + b as
n→∞
Proof This result follows directly from the equivariance of univariate quantiles under linear
transformations.
Corollary 4.2 If X ∼ Y then the points plotted in a qq-plot approach the line y = x as
n → ∞.
QQ-plots are interpreted by determining visually whether all points fall on or near the line y = x.
If so, then we conclude the distributions are equal. Alternatively we determine visually whether
all the points fall on or near some line y = ax + b. If so, then we conclude the distributions
are equal modulo location and scale. Examples of two-sample qq-plots are shown in Figure 4.7.
Theorem 4.1 can also be used to construct a formal test of the equality of the distributions
plotted; see Shapiro & Wilk (1965).
Figure 4.7: Left is a qq-plot comparing samples of size 1000 both from a N(0, 1) distribution and
right compares a sample from a N(0, 1) distribution on the x-axis and Cauchy(1, 1) distribution
on the y-axis.
4.3.2 Functional qq-plots
These definitions of qq-plots can be extended to the functional quantiles of Definitions 3.1 and 3.7.
Definition 4.7 (One sample functional qq-plots) If {X1, X2, . . . , Xn} is a dataset generated by a real functional random variable X with order statistics {X(1)(t), X(2)(t), . . . , X(n)(t)}
for each t ∈ I and Y is a real functional random variable with quantiles {Q^Y_α(t)}α∈(0,1) then the
qq-plot of the dataset {X1, X2, . . . , Xn} compared to Y is the plot of the lines:

{{(X(i)(t), Q^Y_{(i−0.5)/n}(t))}t∈I : i ∈ {1, 2, . . . , n}}
Figure 4.8: Functional qq-plots. The chart on the left is a qq-plot (of quartiles) for two samples
of size 200,000, both drawn from the set of 40-day NASDAQ stock price histories immediately
following trading days where there was no single-day decline of 10% or more; the chart on the
right is a qq-plot comparing the 40-day stock price histories following trading days where there
was a single-day decline of 10% or more on the x-axis against 40-day stock price histories following
all other trading days on the y-axis.
Definition 4.8 (Two sample functional qq-plots) If {X1 , X2 , . . . , Xn } and {Y1 , Y2 , . . . Yn }
are datasets generated by real functional random variables X and Y and {X(1) (t), X(2) (t), . . . , X(n) (t)}
and {Y(1) (t), Y(2) (t), . . . Y(n) (t)} for t ∈ I are their respective order statistics then the two-sample
qq-plot of these datasets is the plot of the lines:
{{(X(i) (t), Y(i) (t))}t∈I : i ∈ {1, 2, . . . , n}}
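A sketch of how Definition 4.8 can be drawn in practice is given below (Python with NumPy and matplotlib, which are not the tools used to produce Figure 4.8). It assumes both samples are of equal size and observed on a common grid, and, as in Figure 4.8, it can be restricted to a few ranks such as the quartile ranks so the plot stays readable; all names are illustrative.

    import numpy as np
    import matplotlib.pyplot as plt

    def functional_qq_plot(x_curves, y_curves, ranks=None):
        # Definition 4.8: for each rank i, draw the line
        # {(X_(i)(t), Y_(i)(t)) : t in I}.  `ranks` restricts the plot to a
        # subset of order statistics (e.g. the quartile ranks) to avoid clutter.
        ox = np.sort(x_curves, axis=0)
        oy = np.sort(y_curves, axis=0)
        n = ox.shape[0]
        if ranks is None:
            ranks = range(1, n + 1)
        for i in ranks:
            plt.plot(ox[i - 1, :], oy[i - 1, :])        # the line for rank i
        lims = [min(ox.min(), oy.min()), max(ox.max(), oy.max())]
        plt.plot(lims, lims, linestyle="--")            # reference line y = x
        plt.xlabel("order statistics of sample 1")
        plt.ylabel("order statistics of sample 2")
        plt.show()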
Now we establish an analogous theorem and corollary which should be used to interpret
functional qq-plots.
Theorem 4.3 (Functional qq-plots) If X and Y are real functional random variables such
that Y ∼ aX + b for constants a and b then the lines plotted in a functional qq-plot approach the
line y = ax + b as n → ∞
Proof This theorem is a consequence of the equivariance of the marginal univariate quantiles
of functional random variables under linear transformations.
Corollary 4.4 If X ∼ Y then the lines plotted in a functional qq-plot approach the line
y = x as n → ∞.
Functional plots are then interpreted in an analogous way, by evaluating how close the lines
plotted are to the line y = x or a line y = ax + b. Examples of functional qq-plots are shown in
Figure 4.8.
4.3.3 Alternatives
The qq-plot developed in the previous section may be used for two purposes: (1) to test the
fidelity of a sample to a specified functional random variable; (2) to test the equality of the
distribution of two sets of functional data.
There appear to be no available alternatives for the second purpose, but several are available
for the first purpose: Bugni et al. (2009) define an extension of the empirical distribution function
for functional random variables and demonstrate that an analogue of the Cramér–von Mises test
may be applied to this distribution functional; Cuesta-Albertos, Fraiman & Ransford (2007)
demonstrate that a functional random variable can be characterised by carefully chosen random
projections; and Cuesta-Albertos, del Barrio, Fraiman & Matrán (2007) propose a goodness-of-fit test based on these random projections. Several specialised tests are developed in the
econometrics literature, see Cai (2011) and Hong & Li (2005).
The tests proposed in Bugni et al. (2009) and Cuesta-Albertos, del Barrio, Fraiman & Matrán
(2007) are an excellent method for testing whether a set of functional data was generated by a
specified distribution; indeed, in most situations they may well be more powerful than the tests
described here because of Theorem 3.4. However, functional qq-plots, unlike existing tests, may
also be used to describe the nature and severity of departures from the null hypothesis. Given
that functional data are often generated by processes that cannot be modelled precisely, rejection
of the null hypothesis should not always be the end of the story.
Chapter 5
Epilogue
This chapter draws to a close the work presented in this thesis; in Section 5.1 aspects of the
functional quantiles introduced that require further refinement are highlighted; in Section 5.2
avenues of fruitful enquiry are suggested.
5.1 Further refinements
This thesis establishes several useful characteristics of the functional quantiles introduced and
describes some potential applications; however, further work is required to establish other important theoretical properties and some of the suggested applications should only be applied
cautiously as subtle assumptions are made about the functional dataset analysed. We will now
consider the most pressing deficiencies and suggest methods by which they may be addressed.
5.1.1 Characterising a functional random variable
In Theorem 3.4 we saw that, in general, quantiles do not characterise the distribution of a
functional random variable. This shortcoming was partially ameliorated by Theorems 3.6 and
3.9 which show that certain classes of functional random variables are characterised by their
quantiles.
A more robust theory should be developed to address two questions: (1) how frequently the
requirements of Theorems 3.6 and 3.9 are satisfied for functional random variables commonly
encountered; and, (2) how should functional datasets be examined, without making any assumptions about their distribution, to determine if they are likely to be generated by a member of a
class of functional random variables which are characterised by their quantiles? This may proceed by an analysis of the behaviour of functional datasets at points where different functional
data intersect. This analysis may be effective because the behaviour of functional random
variables at the intersection of different realisations is the information lost when a functional
random variable is reduced to its quantiles (see Section 3.1.2).
Figure 5.1: Undetected outliers. This figure plots a synthetic functional dataset; the datum plotted in red behaves very differently from the other data but is not identified as an outlier.
A satisfactory solution to the second question may be coupled with the functional qq-plots,
described in Section 4.3, to provide a more robust assessment of the equality of the distribution
of functional random variables.
5.1.2 More robust measures of dispersion and skewness
In Section 4.1 several methods were suggested for describing the dispersion and skewness of
functional datasets and functional random variables. Each of these methods only used the
location of the quartiles of a functional dataset or functional random variable, as this appears to
be standard practice in the analysis of univariate quantiles.
Important aspects of the dispersion and skewness of functional data may be missed by only
analysing the quartiles of a distribution (for instance the presence of heavy tails), so ideally
better measures of dispersion and skewness should be developed that are able to satisfactorily describe fine-grained information in functional datasets.
5.1.3 Better identification of outliers
In Section 4.2 a procedure for using functional quantiles to detect outliers was outlined. Unfortunately, this procedure is blind to some kinds of exceptional behaviour; see Figure 5.1 for
an example. The outlier described in Figure 5.1 may be detected by applying the procedure of
Section 4.2.1 to the derivative of the functional dataset. However, this is not a panacea: it is relatively easy to construct a similar functional dataset such that each member is not continuously
differentiable and differences in the functions arise primarily at points of non-differentiability;
further work is required to identify such outliers.
5.1.4 Problems with functional qq-plots
In Section 4.3, a method for constructing functional qq-plots was described. To ensure that the
qq-plot did not become too busy and difficult to interpret, we plotted only the quartiles of the
distribution of a functional dataset. Unfortunately, as described in the previous section, this means
that fine-grained information contained in the functional dataset is ignored and conclusions drawn
on the basis of incomplete information may sometimes be faulty. The solution to this problem
may be to construct a separate plot for each quantile of the functional dataset and inspect these
plots individually.
In Section 4.3 we also saw that one of the key strengths of functional qq-plots is that they may
be used to describe departures from the null hypothesis qualitatively. However, no procedures
for describing such departures are provided in this thesis. These procedures may be developed
by generating pairs of synthetic functional datasets where there are only small differences in the
distribution of each dataset. We could then examine the effect of these differences on the form
of functional qq-plots. If such differences also appear in functional qq-plots constructed with
datasets where the underlying distributions are unknown then this is prima facie evidence for
the existence of similar differences in such datasets.
5.2 Future research directions
In univariate and finite-dimensional multivariate data analysis, there is a rich literature on the
use of quantiles in almost all aspects of statistics: quantiles are used to describe datasets, to
conduct tests on datasets and in regression analysis. In principle, each of these methods can be
extended to the functional quantiles introduced in this thesis. We will consider some methods
which may be adapted for use with functional quantiles most readily.
5.2.1 Using quantiles to construct formal tests
Serfling (2002) suggests that one of the criteria on which definitions of multivariate quantiles
should be evaluated is their suitability for constructing formal hypothesis tests. In this thesis we
have adverted to the fact that the quantiles introduced may be used for this purpose; however,
no such tests have been illustrated. The difficulty in constructing such tests arises in accounting
for the infinite-dimensionality of functional quantiles.
5.2.2 A goodness-of-fit test based on functional qq-plots
Shapiro & Wilk (1965) describe an intuitive method of reducing the univariate qq-plot to a
single statistic that may be used to conduct an ordinary hypothesis test. This approach could
be extended to the functional case. This test should be developed after the problems identified
in Section 5.1.1 are addressed to ensure that the null hypothesis is not accepted when there are
significant differences in aspects of the distribution of functional random variables that are not
captured by their quantiles.
Bibliography
Aneiros, G., Cao, R., Vilar-Fernández, J. M. & Muñoz San-Roque, A. (2011), Functional prediction for the residual demand in electricity spot markets, in F. Ferraty, ed., ‘Recent Advances
in Functional Data Analysis and Related Topics’, Contributions to Statistics, Physica-Verlag
HD, pp. 9–15.
URL: http: // dx. doi. org/ 10. 1007/ 978-3-7908-2736-1_ 2
Black, F. & Scholes, M. (1973), ‘The pricing of options and corporate liabilities’, Journal of
Political Economy 81(3), pp. 637–654.
URL: http: // dx. doi. org/ 10. 1086/ 260062
Bowley, A. L. (1926), Elements of statistics, 5th edn, Academic Press, San Diego, CA, USA.
Bremer, M. & Sweeney, R. J. (1991), ‘The reversal of large stock-price decreases’, The Journal
of Finance 46(2), pp. 747–754.
URL: http: // dx. doi. org/ 10. 2307/ 2328846
Bugni, F. A. (2010), ‘Specification test for missing functional data’. Working Paper.
URL: http: // papers. ssrn. com/ sol3/ papers. cfm? abstract_ id= 1593736
Bugni, F. A., Hall, P., Horowitz, J. L. & Neumann, G. R. (2009), ‘Goodness-of-fit tests for
functional data’, Econom. J. 12(S1), S1–S18.
URL: http: // dx. doi. org/ 10. 1111/ j. 1368-423X. 2008. 00266. x
Cai, Z. (2011), Functional coefficient models for economic and financial data, in F. Ferraty &
Y. Romain, eds, ‘The Oxford Handbook of Functional Data Analysis’, Oxford Handbooks in
Mathematics Series, Oxford University Press.
Chaudhuri, P. (1996), ‘On a geometric notion of quantiles for multivariate data’, J. Amer. Statist.
Assoc. 91(434), 862–872.
URL: http: // dx. doi. org/ 10. 2307/ 2291681
Cressy, R. & Farag, H. (2011), ‘Do size and unobservable company factors explain stock price
reversals?’, Journal of Economics and Finance 35, 1–21. 10.1007/s12197-009-9076-4.
URL: http: // dx. doi. org/ 10. 1007/ s12197-009-9076-4
Cuesta-Albertos, J. A., del Barrio, E., Fraiman, R. & Matrán, C. (2007), ‘The random projection
method in goodness of fit for functional data’, Comput. Statist. Data Anal. 51(10), 4814–4831.
URL: http: // dx. doi. org/ 10. 1016/ j. csda. 2006. 09. 007
Cuesta-Albertos, J. A., Fraiman, R. & Ransford, T. (2007), ‘A sharp form of the Cramér-Wold
theorem’, J. Theoret. Probab. 20(2), 201–209.
URL: http: // dx. doi. org/ 10. 1007/ s10959-007-0060-7
David, H. (1970), Order Statistics, John Wiley & Sons, New York, NY, USA.
De Bondt, W. F. M. & Thaler, R. (1985), ‘Does the stock market overreact?’, The Journal of
Finance 40(3), pp. 793–805.
URL: http: // dx. doi. org/ 10. 2307/ 2327804
Delaigle, A. & Hall, P. (2010), ‘Defining probability density for a distribution of random functions’, Ann. Statist. 38(2), 1171–1193.
URL: http: // dx. doi. org/ 10. 1214/ 09-AOS741
Easton, G. S. & McCulloch, R. E. (1990), ‘A multivariate generalization of quantile-quantile
plots’, Journal of the American Statistical Association 85(410), pp. 376–386.
URL: http: // dx. doi. org/ 10. 2307/ 2289773
Ferraty, F., ed. (2010), Journal of Multivariate Analysis, Vol. 101. Issue 2. Statistical Methods
and Problems in Infinite-dimensional Spaces, 1st International Workshop on Functional and
Operatorial Statistics (IWFOS’2008).
URL: http: // dx. doi. org/ 10. 1016/ j. jmva. 2009. 10. 012
Ferraty, F., ed. (2011), Recent Advances in Functional Data Analysis and Related Topics, Contributions to Statistics, Physica-Verlag HD.
URL: http: // dx. doi. org/ 10. 1007/ 978-3-7908-2736-1
Ferraty, F. & Romain, Y., eds (2011), The Oxford Handbook of Functional Data Analysis, Oxford
Handbooks in Mathematics Series, Oxford University Press.
Ferraty, F. & Vieu, P. (2006), Nonparametric Functional Data Analysis: Theory and Practice,
Springer-Verlag New York, Inc., Secaucus, NJ, USA.
Fraiman, R. & Pateiro-López, B. (2011), Functional quantiles, in F. Ferraty & Y. Romain, eds,
‘Recent Advances in Functional Data Analysis and Related Topics’, Contributions to Statistics,
Physica-Verlag HD, pp. 123–129.
URL: http: // dx. doi. org/ 10. 1007/ 978-3-7908-2736-1_ 19
Froda, A. (1929), ‘Sur la distribution des propriétés de voisinage des fonctions de variables
réelles’, Docteur ès Sciences Mathématiques Thesis. Université de Paris.
Geenens, G. (2011), A nonparametric functional method for signature recognition, in F. Ferraty,
ed., ‘Recent Advances in Functional Data Analysis and Related Topics’, Contributions to
Statistics, Physica-Verlag HD, pp. 141–147. 10.1007/978-3-7908-2736-1 22.
URL: http: // dx. doi. org/ 10. 1007/ 978-3-7908-2736-1_ 22
Gervini, D. (2008), ‘Robust functional estimation using the median and spherical principal components’, Biometrika 95(3), 587–600.
URL: http: // dx. doi. org/ 10. 1093/ biomet/ asn031
Goswami, A. & Rao, B. V. (2004), ‘A theorem on compatibility of systems of sets with applications’, Lecture Notes-Monograph Series 45, pp. 332–336.
URL: http: // www. jstor. org/ stable/ 4356320
Groeneveld, R. A. (1998), ‘A class of quantile measures for kurtosis’, The American Statistician
52(4), pp. 325–329.
URL: http: // dx. doi. org/ 10. 2307/ 2685435
Hall, P. & Heckman, N. E. (2002), ‘Estimating and depicting the structure of a distribution of
random functions’, Biometrika 89(1), 145–158.
URL: http: // dx. doi. org/ 10. 1093/ biomet/ 89. 1. 145
Heston, S. L. (1993), ‘A closed-form solution for options with stochastic volatility with applications to bond and currency options’, The Review of Financial Studies 6(2), pp. 327–343.
URL: http: // dx. doi. org/ 10. 1093/ rfs/ 6. 2. 327
Hong, Y. & Li, H. (2005), ‘Nonparametric specification testing for continuous-time models with
applications to term structure of interest rates’, Review of Financial Studies 18(1), 37–84.
URL: http: // dx. doi. org/ 10. 1093/ rfs/ hhh006
Hyndman, R. & Shang, H. L. (2008), Bagplots, boxplots and outlier detection for functional data,
in S. Dabo-Niang & F. Ferraty, eds, ‘Functional and Operatorial Statistics’, Contributions to
Statistics, Physica-Verlag HD, pp. 201–207.
URL: http: // dx. doi. org/ 10. 1007/ 978-3-7908-2062-1_ 31
Jones, H. E. & Bayley, N. (1941), ‘The Berkeley growth study’, Child Development 12(2), pp.
167–173.
URL: http: // dx. doi. org/ 10. 2307/ 1125347
Kong, L. & Mizera, I. (2008), ‘Quantile tomography: using quantiles with multivariate data’.
arXiv:0805.0056v1.
URL: http: // arxiv. org/ abs/ 0805. 0056
Liang, J. & Ng, K. W. (2009), ‘A multivariate normal plot to detect nonnormality’, Journal of
Computational and Graphical Statistics 18(1), 52–72.
URL: http: // dx. doi. org/ /10. 1198/ jcgs. 2009. 0004
Mazouz, K., Joseph, N. L. & Joulmer, J. (2009), ‘Stock price reaction following large one-day
price changes: UK evidence’, Journal of Banking & Finance 33(8), 1481–1493.
URL: http: // dx. doi. org/ 10. 1016/ j. jbankfin. 2009. 02. 010
Moors, J. J. A. (1988), ‘A quantile alternative for kurtosis’, Journal of the Royal Statistical
Society. Series D (The Statistician) 37(1), pp. 25–32.
URL: http: // dx. doi. org/ 10. 2307/ 2348376
Nadaraya, E. A. (1964), ‘On estimating regression’, Theory of Probability and its Applications
9, 141–142.
URL: http: // dx. doi. org/ 10. 1137/ 1109020
R Development Core Team (2011), R: A Language and Environment for Statistical Computing,
R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0.
URL: http: // www. R-project. org
Ramsay, J. (2011), Curve registration, in F. Ferraty & Y. Romain, eds, ‘The Oxford Handbook
of Functional Data Analysis’, Oxford Handbooks in Mathematics Series, Oxford University
Press.
Ramsay, J. O. & Silverman, B. W. (1997), Functional Data Analysis, Springer Series in Statistics,
1st edn, Springer-Verlag.
Ramsay, J. O. & Silverman, B. W. (2002), Applied Functional Data Analysis, Springer Series in
Statistics, 1st edn, Springer-Verlag.
URL: http: // dx. doi. org/ 10. 1007/ b98886
Ramsay, J. O. & Silverman, B. W. (2005), Functional Data Analysis, Springer Series in Statistics,
2nd edn, Springer-Verlag.
Ramsay, J. O., Wickham, H., Graves, S. & Hooker, G. (2010), fda: Functional Data Analysis. R
package version 2.2.5.
URL: http: // CRAN. R-project. org/ package= fda
Rousseeuw, P. J., Ruts, I. & Tukey, J. W. (1999), ‘The bagplot: A bivariate boxplot’, The
American Statistician 53(4), pp. 382–387.
URL: http: // dx. doi. org/ 10. 2307/ 2686061
Serfling, R. (2002), ‘Quantile functions for multivariate analysis: approaches and applications’,
Statistica Neerlandica 56(2), 214–232.
URL: http: // dx. doi. org/ 10. 1111/ 1467-9574. 00195
Shapiro, S. S. & Wilk, M. B. (1965), ‘An analysis of variance test for normality (complete
samples)’, Biometrika 52(3-4), 591–611.
URL: http: // dx. doi. org/ 10. 1093/ biomet/ 52. 3-4. 591
Watson, G. S. (1964), ‘Smooth regression analysis’, Sankhyā: The Indian Journal of Statistics,
Series A 26(4), pp. 359–372.
URL: http: // www. jstor. org/ stable/ 25049340
Wilk, M. B. & Gnanadesikan, R. (1968), ‘Probability plotting methods for the analysis of data’,
Biometrika 55(1), 1–17.
URL: http: // dx. doi. org/ 10. 1093/ biomet/ 55. 1. 1