ABSTRACTS - Leuven Statistics Research Centre (LStat)


ABSTRACTS PRESENTATIONS

1ST LEUVEN STATISTICAL DAY

12 June 2006

A large variety of applied and theoretical statistical research is performed at K.U. Leuven. The aim of the annual Leuven Statistical Day is to highlight this research, not only to our colleagues but also to a wide audience. However, it is impossible to illustrate all statistical activity at our university in one day, and we have therefore decided to select a different topic each year. This year we have chosen Mixed Models as a common theme.

This class of models enjoys a growing interest, from both a theoretical and an applied point of view.

The keynote lecture at our first Leuven Statistical Day is given jointly by Geert Verbeke and Geert Molenberghs. Both are internationally recognized for their books on mixed models and longitudinal data. Their lecture will first review the basic material on linear, generalized linear and non-linear mixed models. In the second half of the lecture, they will treat the more novel aspects of mixed models.

After the keynote lecture, a number of K.U.Leuven researchers from different faculties will present their research on mixed models and illustrate their richness in a variety of areas of application.

Opportunities and Challenges in Mixed Models and Incomplete Data

Geert Verbeke* and Geert Molenberghs+

* Faculty of Medicine, K.U. Leuven

+ Center for Statistics, Hasselt University

Over the last couple of decades the linear mixed model, followed by its generalized linear and non-linear versions, has become a widespread tool for the analysis of broad classes of longitudinal, clustered or, generally, hierarchical data. Not only its elegant properties but also the availability of implementations in standard software packages have contributed to the proliferation of this modeling framework.

In the first part of this presentation, the mixed model setting will be introduced. The beauty and elegance of the paradigm will be juxtaposed with a number of important but non-trivial issues in interpretation, estimation, and inference.

In the second part, we zoom in on the particular problem of hypothesis testing for variance components. This is one important instance where a seemingly straightforward problem necessitates a non-standard solution in the mixed-model context.
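To illustrate why this testing problem is non-standard: under the null hypothesis a variance component lies on the boundary of its parameter space, so the usual chi-squared reference distribution for the likelihood ratio statistic no longer applies. For the simplest case, testing a single random intercept against no random effect, a well-known result is that the null distribution is a 50:50 mixture of chi-squared distributions with 0 and 1 degrees of freedom. The following is a minimal sketch of that special case only. The function names are hypothetical, and the abstract does not state which procedure the lecture presents.

```python
import math


def chi2_sf_1df(x: float) -> float:
    """Survival function P(X > x) of a chi-squared variable with 1 df."""
    if x <= 0.0:
        return 1.0
    # For 1 df, X = Z^2 with Z standard normal, so P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))


def mixture_p_value(lrt: float) -> float:
    """p-value under the 0.5 * chi2(0) + 0.5 * chi2(1) mixture.

    chi2(0) is a point mass at zero, so it contributes nothing
    to the tail probability when the statistic is positive.
    """
    if lrt <= 0.0:
        return 1.0
    return 0.5 * chi2_sf_1df(lrt)


if __name__ == "__main__":
    lrt = 3.0  # hypothetical likelihood ratio statistic
    print("naive chi2(1) p-value:", chi2_sf_1df(lrt))
    print("mixture p-value:      ", mixture_p_value(lrt))
```

For a positive test statistic the mixture p-value is half the naive one, so ignoring the boundary makes the test conservative.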

Third, in practice one is increasingly confronted with questions that can only be answered by analyzing simultaneously a number of outcomes each of which separately would already be described by a mixed model. This calls for multivariate extensions. A particularly flexible approach will be presented.

Finally, complex hierarchical data of the type discussed here are particularly prone to missingness. We will sketch a broad framework to deal with incompleteness, showing that mixed models can in many contexts serve to define the primary analysis. In addition, we highlight their role for sensitivity analyses.

All aspects discussed above will be illustrated using real-life examples.

Attitudes Towards Migration in Europe:

A Cross-Cultural and Contextual Approach.

Bart Meuleman

Faculty of Social Sciences, K.U. Leuven

Given the tendency towards harmonization of the migration policies of the EU member countries, international comparisons of attitudes to migration are highly relevant. In this contribution, individual and particularly contextual determinants of these attitudes are studied. Data from round 1 of the European Social Survey (ESS) are used. In this large-scale international survey, 22 European countries participated.

Realistic group-conflict theory is our theoretical point of departure. According to this theoretical framework, economic competition between ethnic groups is the main source of ethnocentrism. Consequently, we hypothesize that the overall performance of the national economy and the size of the immigrant population will influence attitudes towards migration among the autochthonous population. Apart from these variables at the national level, individual control variables (such as age, sex and education) are taken into account.

Multi-level modelling is an appropriate statistical tool for the analysis of cross-cultural survey data, since respondents are nested within countries. This technique allows individual as well as national variables to be included in one and the same model. Furthermore, it is possible to test for cross-level interactions.
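As a generic illustration (not taken from the study itself), a two-level model for respondent $i$ in country $j$, with one individual-level covariate $x_{ij}$, one country-level covariate $z_j$ and a cross-level interaction, could be written as follows. All symbols are assumptions introduced for this sketch.

```latex
y_{ij} = \beta_0 + \beta_1 x_{ij} + \gamma_1 z_j + \gamma_2 x_{ij} z_j
         + u_{0j} + u_{1j} x_{ij} + \varepsilon_{ij},
\qquad
(u_{0j}, u_{1j})^\top \sim N(0, \Sigma_u), \quad
\varepsilon_{ij} \sim N(0, \sigma^2)
```

Here $\gamma_2$ is the cross-level interaction: it describes how the effect of the individual covariate changes with the country-level covariate, while $u_{0j}$ and $u_{1j}$ capture the remaining between-country variation in intercepts and slopes.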

A Double-Structure Structural Equation Model for Three-Mode Data

Jorge González, Paul De Boeck and Francis Tuerlinckx

Faculty of Psychology and Educational Sciences, K.U. Leuven

Individual differences and situational differences are important phenomena in the study of human behavior. The regular structural equation model (SEM) considers latent person variables to account for individual differences, but it does not consider a latent structure for situational differences. In this paper a general version of the SEM, called the Double-Structure Structural Equation Model (2sSEM), is presented, which simultaneously incorporates a latent structure for individual and situational differences. The model is evaluated using a three-mode data set of persons by situations by responses about emotions.

Mixed models for postharvest quality assessment of horticultural products

Bart De Ketelaere and Jeroen Lammertyn

Faculty of Bioscience Engineering, K.U. Leuven

In the field of postharvest quality assessment of horticultural products, research on the development of non-destructive quality sensors to replace destructive and often time-consuming sensors has surged in the last decade, making it possible to take repeated quality measurements on the same product. The flexibility of mixed models, with respect to both the repeated-measures design of the experiments and the large inter- and intra-product biological variability inherent to these horticultural products, is an important advantage over classical techniques. In this lecture the benefits of mixed models for describing the quality behavior of horticultural produce will be illustrated with three practical examples from postharvest technology: (i) describing tomato firmness evolution during postharvest storage with linear mixed models, (ii) modelling the time effect of physical treatments on strawberry quality with mixed models for repeated multicategorical responses, and (iii) tomato cultivar grouping based on pre- and postharvest color profiles using non-linear mixed models.

Using Mixed Models in Actuarial Statistics

Katrien Antonio

Faculty of Science, K.U. Leuven

We discuss how mixed models can be applied in the analysis of insurance data and the decision-making process that follows it. The starting point is traditional credibility models and their connection with linear mixed models. The credibility ratemaking problem concerns the prediction of future claims of a risk class, given past claims of that and related risk classes. Traditional credibility formulas, like Bühlmann (1967, 1969), Bühlmann-Straub (1970) or Hachemeister (1975), can be reconstructed using the explicit expressions for the MLE of the fixed effects and the BLUP for the random effects in a linear mixed model. This appealing analogy is a first step towards the interpretation of traditional credibility schemes in the framework of generalized linear models, using the methodology of generalized linear mixed models. Next to the credibility ratemaking problem, loss reserving with mixed models is discussed. The problem of loss reserving arises when claims originating in a particular year cannot be finalized in the same year.

Therefore, actuaries will hold provisions or reserves to meet future obligations towards their policyholders. Traditional statistical techniques for claims reserving are framed within a so-called run-off triangle and use (generalized) linear regression. Using the concept of mixed models, their connection with smoothing methods and their implementation with Bayesian statistics, we present some new and promising alternatives to the techniques that are currently used. The talk ends with a discussion of some new statistical approaches for non-life ratemaking with clustered data. In the framework of distributions from the exponential family, as well as in the context of heavy-tailed distributions, two approaches are considered: copula modelling and modelling via mixed models. During the talk, special attention is given to prediction, since this is a major concern for actuaries.
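The link between credibility formulas and the BLUP mentioned earlier in this abstract can be made concrete in the simplest setting. In the random-intercept model X_ij = mu + b_i + e_ij, with Var(b_i) = tau2 and Var(e_ij) = sigma2 treated as known, mu plus the BLUP of b_i coincides with the Bühlmann credibility premium. The sketch below is a minimal illustration of that special case. The function names and the numbers are hypothetical, and in practice mu and the variance components would have to be estimated.

```python
def credibility_factor(n: int, sigma2: float, tau2: float) -> float:
    """Buhlmann credibility weight Z = n / (n + sigma2 / tau2)."""
    return n / (n + sigma2 / tau2)


def credibility_premium(claims, mu: float, sigma2: float, tau2: float) -> float:
    """Premium for one risk class: Z * class mean + (1 - Z) * overall mean.

    This equals mu + BLUP(b_i) in the random-intercept model
    when mu, sigma2 and tau2 are taken as known.
    """
    n = len(claims)
    if n == 0:
        # No claims history: fall back on the collective mean.
        return mu
    z = credibility_factor(n, sigma2, tau2)
    class_mean = sum(claims) / n
    return z * class_mean + (1.0 - z) * mu


if __name__ == "__main__":
    past_claims = [9.0, 11.0, 10.0, 12.0, 8.0]  # hypothetical claim amounts
    print(credibility_premium(past_claims, mu=8.0, sigma2=4.0, tau2=1.0))
```

The weight Z grows with the number of observed years and with the between-class variance, so a long or highly distinctive claims history pulls the premium towards the class's own mean.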

Poverty Dynamics in Europe.

A Multilevel Discrete-Time Recurrent Hazard Analysis

Marc Callens and Christophe Croux

Faculty of Economics and Applied Economics, K.U. Leuven

In this paper we use multilevel discrete-time recurrent hazard analysis to simultaneously model the impact of life cycle events and structural processes on poverty entry and exit across European regions. The research questions are: (i) how important are life cycle events for entry into and exit from poverty, and (ii) are there differences in poverty dynamics between European regions and, if so, how can they be explained? The analysis is based on individual and household panel data from the European Community Household Panel, linked with a regional time series database.

The main findings are that men's poverty dynamics are dominated by employment-related events, while for women demographic events also play a role. Regional structural factors have only a slight influence, or none, on poverty transitions, but the welfare regime turns out to be highly significant for poverty entry.

Prediction of Renal Graft Failure Using Multivariate Longitudinal Profiles

Steffen Fieuws

Faculty of Medicine, K.U. Leuven

Patients who have undergone renal transplantation are monitored longitudinally at irregular time intervals over 10 years. Each visit yields a set of biochemical and physiological markers containing valuable information for anticipating a failure of the graft. General linear, generalized linear and nonlinear mixed models are used to describe the longitudinal profiles of each marker. The univariate mixed models are joined into a multivariate mixed model by specifying a joint distribution for the random effects. Due to the high number of markers, a pairwise modeling strategy, in which bivariate mixed models are fitted for all possible pairs of markers, has been used to obtain parameter estimates for the multivariate mixed model. These estimates are used in Bayes' rule to calculate, at each point in time, the risk that the graft will fail in the remaining part of the 10-year period following transplantation.
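The Bayes' rule step can be sketched in its most basic form: given a prior failure probability and the density of the observed marker data under "failure" and "no failure", the posterior risk follows directly. The example below is a deliberately simplified stand-in. It uses a single marker value with normal densities, whereas the abstract describes densities derived from a fitted multivariate mixed model. All names and numbers are hypothetical.

```python
import math


def normal_pdf(y: float, mean: float, sd: float) -> float:
    """Density of a normal distribution, used here as a stand-in marker model."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def failure_risk(prior_fail: float, density_fail: float, density_ok: float) -> float:
    """Posterior probability of failure via Bayes' rule."""
    numerator = prior_fail * density_fail
    denominator = numerator + (1.0 - prior_fail) * density_ok
    return numerator / denominator


if __name__ == "__main__":
    y = 2.6  # hypothetical marker value observed at a visit
    risk = failure_risk(
        prior_fail=0.2,
        density_fail=normal_pdf(y, mean=3.0, sd=1.0),
        density_ok=normal_pdf(y, mean=1.0, sd=1.0),
    )
    print("posterior failure risk:", risk)
```

Re-evaluating the densities as new visits accumulate is what would make the risk dynamic over the follow-up period.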

Penalized Splines as Mixed Models for Survey Estimation

Gerda Claeskens

Faculty of Economics and Applied Economics, K.U. Leuven

A class of estimators based on penalized spline regression is proposed. We use these to estimate finite population totals in the presence of auxiliary information. These estimators have many appealing features: like classical survey regression estimators, they are weighted linear combinations of sample observations, with weights calibrated to known control totals. Further, they allow straightforward extensions to multiple auxiliary variables and to complex designs. Data-driven penalty selection using restricted maximum likelihood is considered in the context of unequal probability sampling designs.

Further, we propose a new small area estimation approach that combines small area random effects with a smooth, nonparametrically specified trend. By using penalized splines as the representation for the nonparametric trend, it is possible to express the small area estimation problem as a mixed effect regression model. This model is readily fitted using existing model fitting approaches such as restricted maximum likelihood. We develop a corresponding bootstrap approach for model inference and estimation of the small area prediction mean squared error. The applicability of the method is demonstrated on a survey of lakes in the Northeastern US.
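The mixed-model representation of a penalized spline can be written out for one common choice of basis. The sketch below assumes a truncated-line basis with knots $\kappa_1, \dots, \kappa_K$ and a single covariate $x_i$. The abstract does not specify the basis, and the notation is introduced here for illustration only.

```latex
y_i = \beta_0 + \beta_1 x_i + \sum_{k=1}^{K} u_k \,(x_i - \kappa_k)_+ + \varepsilon_i,
\qquad
u_k \sim N(0, \sigma_u^2), \quad \varepsilon_i \sim N(0, \sigma_\varepsilon^2)

% Equivalent penalized least-squares criterion:
\min_{\beta, u} \; \lVert y - X\beta - Zu \rVert^2 + \lambda \lVert u \rVert^2,
\qquad
\lambda = \sigma_\varepsilon^2 / \sigma_u^2
```

Treating the spline coefficients $u_k$ as random effects means the smoothing parameter $\lambda$ is a ratio of variance components, which is why it can be chosen by restricted maximum likelihood.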

Piecewise Linear Mixed Models with Random Change-Points

Wim Van den Noortgate & Patrick Onghena

Faculty of Psychology and Educational Sciences, K.U. Leuven

An important application of mixed models is the analysis of longitudinal data. These models often include a time-related covariate, or one or more other characteristics of the measurement points. The effects of these covariates may be defined as randomly varying over persons and this variation may be modelled using person characteristics. A subclass of these models are piecewise linear mixed models, straightforwardly extended to piecewise generalized linear mixed models, assuming that the slopes of a covariate can change at one or more points. These change-points sometimes can be derived from the way the study is set up, or may be suggested by an underlying theory. It is, however, also possible to estimate the unknown change-points of piecewise mixed models, which becomes a more challenging problem if these change-points are regarded as randomly varying over persons, and if one is interested in an explanation of this variation. The models will be discussed, and illustrated using data from the Research group for Stress, Health, and Well-being.
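The mean structure of a piecewise linear model with one change-point can be sketched as follows. This is a minimal illustration under assumed names and values: a person-specific trajectory is obtained by adding hypothetical random deviations to the population intercept, slope and change-point. Estimating such deviations is the hard part and is not attempted here.

```python
def piecewise_mean(t: float, intercept: float, slope_before: float,
                   slope_change: float, change_point: float) -> float:
    """Broken-stick mean: the slope shifts by slope_change after change_point."""
    return intercept + slope_before * t + slope_change * max(0.0, t - change_point)


def person_mean(t: float, fixed: dict, deviation: dict) -> float:
    """Person-specific mean: population parameters plus that person's deviations."""
    return piecewise_mean(
        t,
        intercept=fixed["intercept"] + deviation.get("intercept", 0.0),
        slope_before=fixed["slope_before"] + deviation.get("slope_before", 0.0),
        slope_change=fixed["slope_change"] + deviation.get("slope_change", 0.0),
        change_point=fixed["change_point"] + deviation.get("change_point", 0.0),
    )


if __name__ == "__main__":
    fixed = {"intercept": 1.0, "slope_before": 0.5,
             "slope_change": 1.0, "change_point": 3.0}  # hypothetical values
    late_changer = {"change_point": 1.0}  # this person's change occurs one unit later
    for t in (2.0, 5.0):
        print(t, person_mean(t, fixed, {}), person_mean(t, fixed, late_changer))
```

Because the change-point enters through `max(0, t - change_point)`, the mean is not linear in that parameter, which is one reason random change-points are harder to handle than random slopes.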
