Overview Functional PCA finite element basis Estimation Examples Extensions and discussion

A Functional Covariance Model
Jim Ramsay, McGill University
SAMSI Workshop on the Interface of Functional and Longitudinal Data Analysis
November 9, 2010

Where we go in this talk

The need for a flexible representation of functional covariance is indicated.
We use a functional version of the Choleski decomposition $\Sigma = L'L$ to construct fixed lag covariance kernels.
We control smoothness of these kernels by using a linear finite element expansion of the Choleski factor.
Some examples are presented.
Extensions and applications are suggested.

Outline

1 Goals, Ambitions, and Context
2 A functional factor analysis model
3 A finite element basis for functional covariance
4 Estimation of functional covariance from data
5 Examples
6 Extensions and discussion

The need for nonparametric functional models of covariance

Structured variance–covariance matrices are familiar in multivariate data analysis:
- band-structured
- block diagonal
- patterned
- parametric models
Structure is important when the number of variables p gets at all large because one often needs to use fewer than p(p + 1)/2 degrees of freedom to represent covariance, especially in mixed effect designs.
It may also be that variables are clustered or organized in other ways.
Confirmatory factor analysis and structural equation modeling are the usual methods for structured covariance problems.
Longitudinal data analysis for equi-spaced times

When the number of time points n becomes at all large (5 or more), it can be critical to conserve the degrees of freedom used to represent covariation, especially for mixed effect designs.
It can be reasonable to assume that covariances are substantial over only limited time lags.
Stationary and band-structured covariance structures are often used.

Continuous, random and unequally spaced observation times

These situations arise in spatial domains, for random point process observation times, and for fixed sampling designs where dense sampling is needed over certain intervals.
Many biological and epidemiological processes are of this nature.
Covariance models here are limited, parametric, and tend to assume a lot, such as stationarity.
We need flexible nonparametric modeling tools expressed in continuous time.
The number of parameters p that we use must not depend on the number n of sampling points, since n may be very large or itself a random variable.

An alternative to PCA of covariates in linear regression

Functional linear models with functional predictors require the inversion of a covariance kernel.
But typically the sample size is less than the dimensionality of each datum, in which case the usual covariance kernel estimate is singular.
A truncated Karhunen-Loève expansion (PCA) is often used to finesse this problem, but risks discarding useful and interesting variation.
We need an alternative strategy for estimating this kernel that is both low-dimensional and nonsingular.
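The singularity problem described on the last slide above can be illustrated numerically. The following is a minimal sketch, not taken from the talk: the sample size, grid size, simulated Gaussian data, and the number q of retained components are all hypothetical choices. It shows that the usual sample covariance of N curves observed at more than N points has rank at most N − 1, and that a truncated Karhunen-Loève expansion discards a measurable share of the total variance.

```python
import numpy as np

rng = np.random.default_rng(0)

n_curves = 10   # sample size N (hypothetical)
n_points = 50   # number of sampling points per curve (hypothetical)

# Rows are curves, columns are sampling points (simulated stand-in data).
X = rng.standard_normal((n_curves, n_points))

# Usual sample covariance across sampling points (n_points x n_points).
S = np.cov(X, rowvar=False)

# Centering costs one degree of freedom, so the rank is at most N - 1.
rank_S = np.linalg.matrix_rank(S)
print("shape of S:", S.shape)
print("rank of S:", rank_S)

# Truncated Karhunen-Loeve (PCA) expansion keeping the q leading components.
eigvals, eigvecs = np.linalg.eigh(S)        # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]           # reorder to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
q = 3                                       # hypothetical truncation level
S_trunc = (eigvecs[:, :q] * eigvals[:q]) @ eigvecs[:, :q].T

# Share of total variance that the truncation throws away.
discarded = eigvals[q:].sum() / eigvals.sum()
print("fraction of variance discarded:", discarded)
```

Both S and its truncation are singular here, which is why neither can be inverted directly in a functional linear model.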
2 A functional factor analysis model

The multivariate factor analysis model

The unconstrained factor analysis model for an order p sample covariance matrix S is
$$\Sigma = \Lambda\Lambda' + \Psi$$
Λ is a p by k factor loadings matrix, k being the number of factors with k << p.
Ψ is an order p diagonal matrix with diagonal entries being the unique variances ψ_j ≥ 0.
Social scientists love this model because they work with large numbers p of variables, and therefore need a low-dimensional representation, and Ψ captures the error variation that varies from variable to variable and that can be assumed to be roughly uncorrelated across variables.

The functional factor analysis model

Matrix Σ is now a bivariate function σ such that σ(s, t) is the covariance between the values of a sample or population of functional observations at times s and t.
The factor analysis model is now
$$\sigma(s, t) = \int \lambda(s, w)\,\lambda(t, w)\,dw + \psi(s, t)$$
It may also be that the second term represents independent white noise contributions, so that ψ(s, t) = 0 for s ≠ t.

The principal component function λ

$$\sigma(s, t) = \int \lambda(s, w)\,\lambda(t, w)\,dw + \psi(s, t)$$
Let's assume observations defined over [0, T].
But λ need not vary over [0, T] as a function of w.
In fact, we require that λ(s, w) = 0 for w > s, corresponding to the lower triangular matrix $L'$ in the Choleski decomposition $\Sigma = L'L$.
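A discrete analogue may make the triangular constraint on λ concrete. The sketch below is an illustration under assumptions and is not the talk's implementation: the grid size n, the bandwidth b, and the random entries are hypothetical. The array Lam[s, w] stands in for λ(s, w) on a grid, the integral becomes a matrix product, and an additional lower band limit is imposed to mimic the fixed lag kernels mentioned in the overview.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 8   # number of grid points (hypothetical)
b = 2   # bandwidth in grid steps (hypothetical)

# Lam[s, w] is a discrete stand-in for lambda(s, w).
Lam = rng.standard_normal((n, n))
Lam = np.tril(Lam)         # enforce lambda(s, w) = 0 for w > s
Lam = np.triu(Lam, k=-b)   # additionally enforce lambda(s, w) = 0 for w < s - b

# Discrete analogue of sigma(s, t) = integral of lambda(s, w) * lambda(t, w) dw.
Sigma = Lam @ Lam.T

# Lag |s - t| for every pair of grid points.
idx = np.arange(n)
lag = np.abs(np.subtract.outer(idx, idx))
outside_band = lag > b

print("largest |Sigma| outside the band:", np.abs(Sigma[outside_band]).max())
print("smallest eigenvalue of Sigma:", np.linalg.eigvalsh(Sigma).min())
```

Because row s of Lam is supported on w in [s − b, s], two rows whose indices differ by more than b share no support, so the corresponding covariance is zero by construction rather than by estimation.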
The range of w

$$\sigma(s, t) = \int \lambda(s, w)\,\lambda(t, w)\,dw + \psi(s, t)$$
The variable of integration w varies over [−B, T].
We use a trapezoidal domain of λ in such a way that
- B > T places no restriction on σ
- B = T implies that σ(0, T) = 0
- B < T implies that σ(s, t) = 0 for |s − t| > δ > 0
- B = 0 implies that σ(s, t) = 0 for s ≠ t

[Figure: A band-structured λ domain for [0, 4] with B = 2; σ(0.5, 3.5) = 0]

3 A finite element basis for functional covariance

A finite element basis for λ

To make this work, we need to represent λ in terms of a basis expansion
$$\lambda(s, w) = \sum_{k}^{K} \sum_{\ell}^{L} c_{k\ell}\,\phi_{k\ell}(s, w)$$
The finite element basis system is widely used in the approximation of solutions to partial differential equations.
It begins by subdividing the domain into triangular subregions.
In the case of λ defined over a parallelogram, this is easy.

[Figure: A triangulation of the domain of λ]

A finite element basis for λ

A basis function $\phi_{k\ell}(s, w)$ is defined for each vertex in the triangulation.
We number the vertices vertically with k = 0, ..., K and horizontally from right to left with ℓ = 0, ..., L; L ≤ K.
There are (I + 1)(J + 1) vertices.

[Figure: A triangulation of the domain of λ with vertices]

Tent basis functions

$\phi_{k\ell}(s, w)$ satisfies:
- $\phi_{k\ell}(s, w)$ is piecewise linear, rather like order 2 splines.
- $\phi_{k\ell}(s, w) = 1$ at vertex (k, ℓ).
- $\phi_{k\ell}(s, w) = 0$ on the opposite edges of the triangles sharing vertex (k, ℓ).
- $\phi_{k\ell}(s, w) = 0$ everywhere else, and basis functions for edge vertices vanish outside the trapezoidal domain.
As with order 2 splines, basis function (k, ℓ) is a tent function defined over the hexagon with vertex (k, ℓ) at its center.
These basis functions are called first order Lagrangian elements in the numerical literature.

[Figure: Tent basis domains for vertices (1,1) and (3,1)]

[Figure: Tent basis function for vertex (2,1)]

Properties of σ

The covariance kernel σ is piecewise quadratic, and continuous.
σ(s, t) = 0 when |s − t| ≥ δ.
As long as $\int \lambda^2(s, w)\,dw > 0$ for all s, σ will be positive definite.
σ can be further constrained by fixing specified values for coefficients $c_{k\ell}$.
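The tent basis functions described above can be evaluated with barycentric coordinates, which is the standard way to compute first order Lagrangian elements. The sketch below is illustrative only: the function names `barycentric` and `tent_value` are hypothetical, and the six triangles form a hexagon on a unit grid chosen for convenience rather than the triangulation used in the talk.

```python
def barycentric(p, a, b, c):
    """Barycentric coordinates of point p with respect to triangle (a, b, c)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return wa, wb, 1.0 - wa - wb


def tent_value(p, triangles, vertex):
    """Value at p of the piecewise linear tent function attached to `vertex`.

    `triangles` lists the triangles (vertex triples) that share `vertex`.
    The function is 1 at `vertex`, 0 on the opposite edges, 0 outside.
    """
    for tri in triangles:
        w = barycentric(p, *tri)
        if all(wi >= -1e-12 for wi in w):   # p lies inside this triangle
            return w[tri.index(vertex)]
    return 0.0


# Hexagon of six triangles around the vertex (1, 1) on a unit grid
# (hypothetical layout, with diagonals running lower-left to upper-right).
v = (1.0, 1.0)
ring = [(2.0, 1.0), (2.0, 2.0), (1.0, 2.0), (0.0, 1.0), (0.0, 0.0), (1.0, 0.0)]
hexagon = [(v, ring[i], ring[(i + 1) % 6]) for i in range(6)]

print("at the vertex:", tent_value(v, hexagon, v))
print("inside the hexagon:", tent_value((1.5, 1.25), hexagon, v))
print("outside the hexagon:", tent_value((3.0, 3.0), hexagon, v))
```

Within each triangle the tent function is simply the barycentric coordinate belonging to the central vertex, which is why it is linear on each piece and continuous across shared edges.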
[Figure: A piecewise linear covariance function for [0,4]]

[Figure: A random covariance function for [0,4] (c ∼ N(0, 1))]

Expressing σ(s, t) (matrix notation)

$$\sigma(s, t) = c' R(s, t)\, c \quad \text{and} \quad \sigma(t, s) = c' R'(s, t)\, c$$
where c is the set of coefficients $c_{k\ell}$ put in vector form, and the order (I + 1)(J + 1) matrix R(s, t) contains in corresponding order the cross-product integrals
$$\int_{-B}^{T} \phi_{k_1\ell_1}(s, w)\,\phi_{k_2\ell_2}(t, w)\,dw$$
for all sets $\{k_1, \ell_1, k_2, \ell_2\}$.
The quadratic dependency of σ on c implies rapid convergence, and starting values for c can be set up using nonlinear least squares fitting.

R(s, t) is sparse

Because the supports of the functions are small, most of these cross-products are zero.
Devising an efficient algorithm to compute the nonzero cross-products was the main technical challenge in the project.
An algorithm for computing R(s, t) is available in Matlab and C.
Fast computation and economical storage are achieved by computing with each of these matrices stored in sparse storage mode.
Cross-products are computed once and for all before the fitting phase, and do not need to be re-computed during an optimization of a fitting criterion.

4 Estimation of functional covariance from data

Data fitting methods

σ can be fit to data in many ways.
A Wishart-based loss function can be used to estimate σ from a sample discrete variance-covariance matrix Σ by maximum likelihood.
A Gaussian-based likelihood can be used to estimate σ directly from either discrete or continuous data.

Tensor notation

Computing functions, gradients and hessians for multi-index objects like R(s, t), which has six indices, is made much easier by using tensor notation.
Einstein summation notation specifies that there is summation over repeated indices.
Repeated indices usually occur in subscript/superscript pairs, called covariant and contravariant indices, respectively.
Thus, in tensor notation
$$\sigma(s_i, t_j) = c_{k_1\ell_1}\, r^{k_1\ell_1 k_2\ell_2}(s_i, t_j)\, c_{k_2\ell_2}$$

The Wishart log likelihood

Let S and Σ be the sample and population variance-covariance matrices, respectively.
The negative log likelihood (dropping unneeded constant terms) in matrix notation is
$$F(S, \Sigma) = \log|\Sigma| + \operatorname{trace}(S\Sigma^{-1})$$
and in tensor notation is
$$F(S^{ij}, \Sigma^{ij}) = \log|\Sigma^{ij}| + S^{ij}\,\Sigma_{ij}$$
Indices switched from covariant to contravariant, and vice versa, indicate inversion.
Note the use of covariant indices for the inverse $\Sigma^{-1}$ of Σ.
The implied double summation produces the trace value.

The factor analysis model

The unconstrained factor analysis model for an order p sample covariance matrix S is
$$\Sigma = \Lambda\Lambda' + \Psi$$
where Λ is a p by k factor loadings matrix, k being the number of factors with k << p; and Ψ is an order p diagonal matrix with diagonal entries being the unique variances ψ_j ≥ 0.
In tensor notation
$$\Sigma^{ij} = \Lambda^{i}_{k}\,\Lambda^{jk} + \Psi^{ij}$$

The derivative of F with respect to Σ

The derivative is
$$\frac{\partial F}{\partial \Sigma^{ij}} = G_{ij}$$
It is covariant in both indices.
This is worked out in any multivariate statistics book, and is, in matrix notation,
$$G = \Sigma^{-1}(\Sigma - S)\Sigma^{-1}$$
In tensor notation, this is
$$G_{ij} = \Sigma_{ik}\,(\Sigma^{k\ell} - S^{k\ell})\,\Sigma_{\ell j}$$

The gradient of F with respect to $c_{k\ell}$

Using the chain rule, the gradient is
$$\frac{\partial F}{\partial c_{k\ell}} = G_{ij}\,\frac{\partial \sigma(s_i, t_j)}{\partial c_{k\ell}}$$
The four-dimensional array $\partial \sigma(s_i, t_j) / \partial c_{k_1\ell_1}$ is
$$\left[\, r^{k_1\ell_1 k_2\ell_2}(s_i, t_j) + r^{k_2\ell_2 k_1\ell_1}(s_i, t_j) \,\right] c_{k_2\ell_2}$$

The hessian of F with respect to $c_{k_1\ell_1}$ and $c_{k_2\ell_2}$

This is somewhat more complex than the gradient, but in tensor notation is completely straightforward, and easily translatable into code.

5 Examples

A simulated example

The following frames show:
- A true or population covariance kernel
- A sample value generated by sampling from the Wishart distribution with 51 degrees of freedom
- The estimate of the population matrix using I = 4 and J = 2.
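The Wishart loss, its derivative G, and the chain-rule gradient from the estimation slides translate almost directly into Einstein-summation code. The sketch below is a hedged illustration and not the Matlab or C code mentioned in the talk: the coefficient pair (k, ℓ) is flattened into a single index, the basis values are dense random numbers rather than sparse finite elements, the integral is replaced by a sum over q quadrature points, and all sizes are hypothetical. A finite-difference check guards against indexing mistakes.

```python
import numpy as np

rng = np.random.default_rng(2)

# Grid size, quadrature points, number of coefficients (all hypothetical).
n, q, m = 5, 20, 4

# Phi[k, i, a]: basis function k evaluated at (s_i, w_a); random stand-in values.
Phi = rng.standard_normal((m, n, q))

# Cross-products r[i, j, k, l] = sum_a Phi[k, i, a] * Phi[l, j, a],
# a discrete stand-in for the cross-product integrals in R(s_i, t_j).
R = np.einsum("kia,lja->ijkl", Phi, Phi)


def sigma_of(c):
    """Sigma_ij = c_k r_ijkl c_l, with summation over repeated indices."""
    return np.einsum("k,ijkl,l->ij", c, R, c)


def wishart_loss(S, Sigma):
    """F(S, Sigma) = log|Sigma| + trace(S Sigma^{-1})."""
    _, logdet = np.linalg.slogdet(Sigma)
    return logdet + np.trace(S @ np.linalg.inv(Sigma))


def loss_gradient(S, c):
    """Gradient of F with respect to c via the chain rule."""
    Sigma = sigma_of(c)
    Sinv = np.linalg.inv(Sigma)
    G = Sinv @ (Sigma - S) @ Sinv                    # dF / dSigma
    # dSigma_ij / dc_k = (r_ijkl + r_ijlk) c_l
    dSigma = np.einsum("ijkl,l->ijk", R + R.transpose(0, 1, 3, 2), c)
    return np.einsum("ij,ijk->k", G, dSigma)


# A sample covariance with 51 degrees of freedom (identity population, hypothetical).
A = rng.standard_normal((n, 51))
S = A @ A.T / 51

c = rng.standard_normal(m)
grad = loss_gradient(S, c)

# Central finite differences as an independent check of the analytic gradient.
eps = 1e-6
fd = np.array([
    (wishart_loss(S, sigma_of(c + eps * e)) - wishart_loss(S, sigma_of(c - eps * e)))
    / (2 * eps)
    for e in np.eye(m)
])
print("max abs difference, analytic vs finite difference:", np.max(np.abs(grad - fd)))
```

Since R depends only on the basis and the evaluation points, it is built once outside the functions, mirroring the remark that cross-products are computed before the fitting phase.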
[Figure: The population covariance function σ]

[Figure: The sample covariance function S]

[Figure: The sample covariance function (another view)]

[Figure: The estimated covariance function Σ̂]

[Figure: The estimated covariance function (another view)]

A real data example: Residuals from height measurements

Height measurements were obtained for 51 girls at 26 unequally spaced ages ranging from 3 to 18 years.
These data were fit by smooth monotone functions, each girl's data being fit separately.
The differences between the actual measurements and the corresponding function values were computed.
The next figure shows the covariances among these differences or fit residuals.

[Figure: The covariance function for height residuals (Observation time grid)]

What we need

Height measurements have larger variances in early childhood.
Covariances are negative over small lags, become positive, and then die out.
We use a band-structured covariance estimate defined by I = 8 and J = 3.
The next figure shows the estimated covariances among these differences.
[Figure: The covariance function for height residuals]

[Figure: The estimated covariance (Observation time grid)]

[Figure: The estimated covariance (fine grid)]

6 Extensions and discussion

Sums of basis systems

The factor analysis model suggests a model involving two basis systems:
- A high resolution basis but with very short lag δ to capture localized noisy sources of variation.
- A low resolution basis but with a longer lag to capture smoother sources of functional variation.
This is easy to implement.

Varying sampling points

Often the times of observation vary from one function to another.
- Observation times are random.
- Time is warped through registration for each curve.
- Data are often missing.
The Wishart loss function can be replaced by a sum of one degree of freedom losses to allow for this.

Spatial covariances

In principle linear finite elements can be constructed as 4-simplices.
In fact, working with tensor products the bases used here would be quite convenient for spatial data distributed over rectangular regions.
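One possible reading of the "sum of one degree of freedom losses" suggested under Varying sampling points is sketched below. This is an interpretation, not the talk's stated method: each curve contributes a Gaussian loss evaluated at its own observation times. The triangular kernel, the lag `delta`, and the simulated curves are hypothetical stand-ins for a fitted σ and real data. The final lines check that when all curves share one grid, the sum reduces to N times the Wishart loss with an uncentered sample covariance.

```python
import numpy as np

rng = np.random.default_rng(3)


def kernel(s, t, delta=1.5):
    """Hypothetical banded kernel: triangular, zero when |s - t| >= delta."""
    return np.maximum(0.0, 1.0 - np.abs(s - t) / delta)


def one_df_loss(x, times):
    """log|Sigma_i| + x' Sigma_i^{-1} x for one curve at its own times."""
    Sigma = kernel(times[:, None], times[None, :])
    _, logdet = np.linalg.slogdet(Sigma)
    return logdet + x @ np.linalg.solve(Sigma, x)


def total_loss(curves):
    """Sum of one-degree-of-freedom losses over (values, times) pairs."""
    return sum(one_df_loss(x, times) for x, times in curves)


# Curves observed at different, irregular times over [0, 4] (simulated).
curves = []
for _ in range(5):
    times = np.sort(rng.uniform(0.0, 4.0, size=rng.integers(4, 9)))
    curves.append((rng.standard_normal(times.size), times))
print("loss with varying sampling points:", total_loss(curves))

# Consistency check on a common grid: the sum equals N times the Wishart loss.
grid = np.linspace(0.0, 4.0, 6)
X = rng.standard_normal((20, grid.size))
S = X.T @ X / X.shape[0]                      # uncentered sample covariance
Sigma = kernel(grid[:, None], grid[None, :])
wishart = np.linalg.slogdet(Sigma)[1] + np.trace(S @ np.linalg.inv(Sigma))
common = total_loss([(x, grid) for x in X])
print("sum of one-df losses:", common)
print("N times Wishart loss:", X.shape[0] * wishart)
```

The triangular kernel used here is itself of the form ∫ λ(s, w) λ(t, w) dw with λ constant on a band of width `delta`, so it is a simple special case of the band-structured model.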