Identification Robust Confidence Sets for Inference on Parameter Ratios with Application to Discrete Choice Models¹

Denis Bolduc², Université Laval
Lynda Khalaf³, Université Laval
Clément Yélou⁴, Université Laval

November 13, 2005

¹ We would like to thank Jean-Marie Dufour, Mohamed Taamouti, Gary Zerbe and Paravastu Swamy for useful comments. This work was supported by the Canada Research Chair Program (Chair in Environment, Université Laval), the Institut de Finance Mathématique de Montréal (IFM2), the Canadian Network of Centres of Excellence [program on Mathematics of Information Technology and Complex Systems (MITACS)], the Social Sciences and Humanities Research Council of Canada, the Fonds de recherche sur la société et la culture (Québec), and the Chair on the Economics of Electric Energy (Université Laval).

² Groupe de recherche en économie de l’énergie, de l’environnement et des ressources naturelles [GREEN], Université Laval. Mailing address: Pavillon J.-A.-De Sève, Ste-Foy, Québec, Canada, G1K 7P4. TEL: (418) 656-5427; FAX: (418) 656-2707; Email: denis.bolduc@ecn.ulaval.ca.

³ Canada Research Chair Holder (Environment). Département d’économique and Groupe de recherche en économie de l’énergie, de l’environnement et des ressources naturelles [GREEN], Université Laval, and Centre interuniversitaire de recherche en économie quantitative (CIREQ), Université de Montréal. Mailing address: GREEN, Université Laval, Pavillon J.-A.-De Sève, Ste-Foy, Québec, Canada, G1K 7P4. TEL: (418) 656 2131-2409; FAX: (418) 656 7412; email: lynda.khalaf@ecn.ulaval.ca.

⁴ Groupe de recherche en économie de l’énergie, de l’environnement et des ressources naturelles [GREEN], Université Laval. Mailing address: Pavillon J.-A.-De Sève, Ste-Foy, Québec, Canada, G1K 7P4. Email: cyelou@ecn.ulaval.ca.

Abstract

We study the problem of building confidence sets for ratios of parameters, from an identification robust perspective.
In particular, we address the simultaneous confidence set estimation of a finite number of ratios. Results apply to a wide class of models suitable for estimation by consistent asymptotically normal procedures. Conventional methods (e.g. the delta method) derived by excluding the parameter discontinuity regions entailed by the ratio functions and which typically yield bounded confidence limits, break down even if the sample size is large [Dufour (1997)]. One solution to this problem, which we take in this paper, is to use variants of Fieller (1940, 1954)’s method. By inverting a joint test that does not require identifying the ratios, Fieller-based confidence regions are formed for the full set of ratios. Simultaneous confidence sets for individual ratios are then derived applying projection techniques, which allow for possibly unbounded outcomes. In this paper, we provide simple explicit closed-form analytical solutions for projection-based simultaneous confidence sets, in the case of linear transformations of ratios. Our solution further provides a formal proof for the expressions in Zerbe, Laska, Meisner and Kushner (1982) pertaining to individual ratios. We apply the geometry of quadrics as introduced by Dufour and Taamouti (2005a, 2005b), in a different although related context. The confidence sets so obtained are exact if the inverted test statistic admits a tractable exact distribution, for instance in the normal linear regression context. The proposed procedures are applied and assessed via illustrative Monte Carlo and empirical examples, with a focus on discrete choice models estimated by exact or simulation-based maximum likelihood. Our results underscore the superiority of Fieller-based methods. Key words: confidence set; generalized Fieller’s theorem; delta method; weak identification; parameter transformation; discrete choice; simulated maximum likelihood. Journal of Economic Literature classification: C10, C35, R40. 
Contents

1 Introduction
2 Statistical Framework
3 Confidence Set methods for One Ratio
4 Simultaneous Confidence Sets for Multiple Ratios
5 Simulation based and empirical illustrations
  5.1 Simulation studies
  5.2 Empirical Application: discrete choice models of travel demand
6 Conclusion
A The Fieller-type solution for one parameter ratio
B Proof of Theorem 1
C Projections for individual ratios

List of Tables

1 Empirical coverage rates for the delta method and the Fieller method based confidence sets for a parameter ratio in a simple binary probit model.
2 Empirical coverage rates for the delta method and the Fieller method based confidence sets for a parameter ratio in a multinomial probit model with a logit kernel.
3 Simultaneous confidence sets for values of total travel time and of out-of-vehicle time from Ben-Akiva and Lerman (1985)’s trinomial logit model; 95% nominal level.
4 Simultaneous confidence sets for values of time as percentage of net personal income.
5 Simultaneous confidence sets for values of time as percentage of net personal income (continued).

1 Introduction

The problem of constructing confidence set (CS) estimates for parameter ratios arises in a variety of econometrics contexts. Important examples include estimation of price and income elasticities in demand systems [see e.g. Deaton and Muellbauer (1980); Banks, Blundell and Lewbel (1997)], and inference on value of time in discrete-choice models for travel demand [Ben-Akiva and Lerman (1985), Ben-Akiva, Bolduc and Bradley (1993), Bolduc (1999)].
The delta Wald-type method is commonly applied to construct confidence intervals (CI) for ratios of parameters or ratios of linear combinations of parameters. In view of its Wald-type form, the method is justified asymptotically for a wide class of models suitable for estimation by consistent asymptotically normal procedures. However, even when the model under consideration is identifiable, parameter ratios involve a possibly discontinuous parameter transformation. More precisely, the ratio is locally almost unidentified (LAU) i.e. is weakly identified over a subset of the parameter space. In such contexts, Wald-type CS methods can have arbitrarily poor coverage, as shown by Dufour (1997). Alternative methods based on generalizing Fieller’s theorem [ Fieller (1940, 1954)] have recently recaptured the attention of theoretical and applied econometricians [see e.g. the recent surveys on weak identification in econometrics by Dufour (2003) and Stock, Wright and Yogo (2002)]. In this paper, we consider Fieller-type simultaneous confidence sets for multiple ratio functions. Fieller (1940, 1954)’s original Theorem proposes a procedure based on inverting a pivotal statistic to obtain an exact CS for the ratio of two means of normal variates. Scheffé (1970) proposes a modification of Fieller’s procedure, which avoids trivial CSs, i.e. CSs which cover the entire real line. Zerbe et al. (1982) [see also Zerbe (1978) and Dufour (1997)] extend Fieller’s theorem in two directions. First, they focus on ratios of parameters in the normal linear regression model. Secondly, they construct multiple confidence regions and simultaneous CSs for several ratios of linear combinations of parameters. In this case, normality still guarantees exact confidence levels. Extensions to ratios of asymptotically normal variates have also been considered [see e.g. 
Young, Zerbe and Hay (1997)], leading to a generalized Fieller solution.¹ As with the delta method, the generalized Fieller approach is based on a consistent asymptotically normal estimator of the parameters whose ratios are under consideration. Yet both methods exploit the latter asymptotic result in fundamentally different ways. In contrast to the delta method, which is derived by excluding the parameter discontinuity regions entailed by the ratio functions and which typically yields bounded confidence limits, Fieller-based confidence regions are formed by inverting a test that does not require identifying the ratios. The geometry of inverting test statistics typically leads to possibly unbounded solutions, a prerequisite for ensuring reliable coverage [Dufour (1997)].

¹ Applications are found more frequently in statistics than econometrics. See e.g. Darby (1980), Selwyn and Hall (1984), Buonaccorsi (1985), Bucephala and Gatsonis (1988), Zerbe (1978), Zerbe et al. (1982), Young et al. (1997). Young et al. (1997) apply Zerbe et al. (1982)’s results to the asymptotic context of linear and nonlinear mixed-effects models.

Applications of Fieller’s method in econometrics are scarce. However, related results can be found in the so-called weak instruments literature [which is now considerable; see Dufour and Jasiak (2001), Moreira (2003), Kleibergen (2005) and the surveys by Stock et al. (2002) and Dufour (2003)]. The weak instruments problem relates to the problem of estimating ratios through LAU difficulties. As is evident from the above cited surveys, recent work on instrumental models has focused on pivotal (exact or asymptotic) statistics aimed at being robust (invariant) to identification status. This property underlies Fieller’s approach, which we study here in the case of parameter ratios.

Our contributions are twofold. First, we address the simultaneous CS estimation of a finite number of ratios, in a generalized Fieller setting, i.e.
given a general asymptotically normal parameter estimate. Simultaneity [for definitions and references, see Savin (1984), Dufour (1989), and Scheffé (1959)] implies controlling joint coverage for all CSs, or more formally, controlling the probability that all the confidence expressions made hold jointly. Fieller’s procedure for simultaneous inference starts from a joint confidence region for the full set of ratios, obtained through the inversion of an identification-robust test [as in e.g. Zerbe et al. (1982)]. Simultaneous CSs for individual ratios or for transformations of ratios are then derived from the latter joint region using projection techniques. Such techniques may however raise non-trivial analytical complications. In this paper, we provide simple explicit closed-form projection-based simultaneous CS formulas for linear transformations of ratios. Results hold exactly in the normal linear regression model. Our general solution further provides a formal proof for the expressions in Zerbe et al. (1982) pertaining to individual ratios. The CSs so obtained are not necessarily bounded. We analyze the unbounded and the trivial outcome cases and provide recommendations for practical applications. Our method of proof uses quadric mathematical tools as introduced by Dufour and Taamouti (2005a, 2005b) for inference in instrumental regressions under weak instruments. The latter approach has not been considered (to the best of our knowledge) in the literature on estimating ratios, although as will become clear from our presentation, simple formula conveniently obtain despite the complicated geometric surfaces under consideration. Secondly, we illustrate our theoretical results with focus on discrete choice models estimated by exact or simulated maximum likelihood (SML). We analyze illustrative Monte Carlo and empirical examples. 
In discrete choice models, Fieller based approaches hold asymptotically, so it is important to assess their performance in finite samples.² We study a simple binary probit model, and a multinomial probit model with a logit kernel [see Ben-Akiva, Bolduc and Walker (2001), Bolduc (1999)]. Our simulation results can be summarized as follows. As expected, we find that the delta method based CSs have very poor coverage, even in the simplest design considered. In contrast, Fieller’s method performs extremely well, even in the most complicated design considered. We also revisit two empirical studies from Ben-Akiva and Lerman (1985) and Bolduc (1999) on transportation demand. We show that the Fieller and delta methods can lead to dramatically different empirical implications, even with very large samples. Our results underscore the superiority of Fieller-based methods.

² Finite sample problems have been documented in some standard discrete choice settings even with linear hypothesis tests; see e.g. Davidson and MacKinnon (1999); see also Savin and Würtz (1999) and Savin and Würtz (2001).

The paper is organized as follows. Section 2 defines our statistical framework. In section 3, to set focus, we discuss the delta and Fieller’s methods in the case of a single parameter ratio. In section 4, we consider the multiple ratio case. Section 5 presents several empirical and simulation based examples and applications. Section 6 concludes.

2 Statistical Framework

Consider the general model

$$(Y, \{P_\theta : \theta \in \Theta\}), \quad \Theta \subset \mathbb{R}^p, \quad p \ge 1 \tag{2.1}$$

where $Y$ is the sample space and $P_\theta$ is a probability distribution over $Y$ indexed by $\theta = (\theta_1, \theta_2, ..., \theta_p)'$. Given a sample of size $T$, we estimate $\theta$ by

$$\hat\theta = (\hat\theta_1, \hat\theta_2, ..., \hat\theta_p)' \overset{asy}{\sim} N(\theta, \Sigma_\theta) \tag{2.2}$$

where the symbol $\overset{asy}{\sim}$ refers to the estimator’s asymptotic distribution, and $\Sigma_\theta$ is estimated consistently by $\hat\Sigma_\theta$. Parameters of interest include $\rho = (\rho_1, \rho_2, ..., \rho_s)'$ where

$$\rho_i = h_i(\theta) = L_i'\theta / K'\theta, \quad i = 1, ..., s, \quad s \le p - 1 \tag{2.3}$$

and $\{L_1, L_2, ..., L_s, K\}$ is a linearly independent set of fixed (nonstochastic) $p \times 1$ vectors.³ These $s$ ratio functions have the same discontinuity set

$$D_K = \{\theta \in \Theta : K'\theta = 0\} \tag{2.4}$$

which is clearly non-empty. Ratios with the same denominator are encountered in many econometric applications; these include long run elasticities in dynamic demand models, and the economic value of time for several use-specific portions of travel time in transportation research. In this context, marginal Wald-type CIs each with asymptotic level $1 - \alpha$ can be obtained for each one of the ratios applying the following result, usually known as the delta method:

$$h_i(\hat\theta) \overset{asy}{\sim} N\left(h_i(\theta),\ \frac{\partial h_i(\hat\theta)}{\partial\theta'}\, \Sigma_\theta\, \frac{\partial h_i(\hat\theta)}{\partial\theta}\right), \quad i \in \{1, ..., s\}. \tag{2.5}$$

For the same problem, Fieller’s method [see e.g. Zerbe et al. (1982)] inverts a Wald-type test associated with the hypothesis $L_i'\theta - \rho_i K'\theta = 0$, $i \in \{1, ..., s\}$.

³ Observe that if $s \ge p$, then $\{L_1, L_2, ..., L_s, K\}$ are linearly dependent. Indeed, if $s > p$, then it is always possible to express at least $s - p$ elements of the set $\{L_1, L_2, ..., L_s\}$ as a linear combination of the others, and if $s = p$, then $K$ is expressible as a linear combination of $L_1, ..., L_s$.

Inverting a test with respect to a parameter means that we collect all the values of this parameter for which the test is not significant. We discuss this motivational case in section 3, with emphasis on two fundamental issues: (i) the null distribution of the inverted test holds without assuming the ratio at hand is identified, and (ii) the associated solution is not necessarily an interval. The Fieller and delta methods discussed so far produce non-independent CSs with nominal (asymptotic) level $1 - \alpha$, for each ratio individually. A joint inference procedure based on combining such CSs has level not smaller than $1 - s\alpha$.
Here, our aim is to construct alternative simultaneous CSs which ensure overall $1 - \alpha$ level control, for the vector of ratios $\rho$, as well as for any linear combination of $\rho$:

$$l_w(\rho) = w'\rho, \tag{2.6}$$

where $w = (w_1, w_2, ..., w_s)'$ is any known fixed $s \times 1$ vector. This case is discussed in section 4, where we proceed as follows. First, we derive a joint confidence region for the vector $\rho$ whose validity, once again, does not require identifying any of the ratios. Formally, we obtain a subset of $\mathbb{R}^s$, denoted $CS(\rho; \alpha)$, such that (asymptotically)

$$\Pr[\rho \in CS(\rho; \alpha)] \ge 1 - \alpha \tag{2.7}$$

for all $\theta \in \Theta$ [i.e. without excluding the discontinuity subset (2.4)], by inverting a joint Wald-type test associated with the joint hypothesis $L_i'\theta - \rho_i K'\theta = 0$, $i = 1, ..., s$. We next derive CSs for any transformation of $\rho$ of the form (2.6), by projection techniques applied to $CS(\rho; \alpha)$; of course, by convenient choices for the vector $w$, our result covers the individual ratios case. To do this, we write $CS(\rho; \alpha)$ in the quadrics form, which allows the application of Dufour and Taamouti (2005b)’s results.

The CSs so obtained are simultaneous in the following sense. For any set of $m$ continuous real valued functions of $\rho$, $g_i(\rho) \in \mathbb{R}$, $i = 1, ..., m$, let $g_i(CS(\rho; \alpha))$ denote the image of $CS(\rho; \alpha)$ by the function $g_i$. Clearly, $\rho \in CS(\rho; \alpha) \Rightarrow g_i(\rho) \in g_i(CS(\rho; \alpha))$, $i = 1, ..., m$, hence

$$\Pr[g_i(\rho) \in g_i(CS(\rho; \alpha)),\ i = 1, ..., m] \ge \Pr[\rho \in CS(\rho; \alpha)]. \tag{2.8}$$

Then equation (2.7) implies that (asymptotically)

$$\Pr[g_i(\rho) \in g_i(CS(\rho; \alpha)),\ i = 1, ..., m] \ge 1 - \alpha, \quad \forall \theta \in \Theta, \tag{2.9}$$

which means that the sets $g_i(CS(\rho; \alpha))$ are: (i) simultaneous [by the definition of simultaneity, see Miller (1981), Dufour (1989), Abdelkhalek and Dufour (1998)], and (ii) identification robust, because (2.9) does not exclude the discontinuity set (2.4).
If a tractable procedure is available to derive $g_i(CS(\rho; \alpha))$, then valid CS inference on any arbitrary number of transformations of $\rho$ is feasible ensuring overall level control. Here we provide simple explicit solutions for the case where the functions $g_i$ are of the linear form.

Throughout the paper, we use the following notation: $I_s$ is the $s$-dimensional identity matrix, and $0_s$ refers to an $s$-dimensional vector of zeros.

3 Confidence Set methods for One Ratio

In this section, we consider a special case which serves to present the delta and Fieller methods in their simplest form, as they apply to the ratio function $\delta(\theta) = \theta_1/\theta_2$. Let

$$\hat\Sigma_{12} = \begin{bmatrix} \hat v_1 & \hat v_{12} \\ \hat v_{12} & \hat v_2 \end{bmatrix}$$

denote the submatrix of $\hat\Sigma_\theta$ that corresponds to $\hat\theta_1$ and $\hat\theta_2$. For this problem, the delta method leads to the usual Wald-type $1 - \alpha$ level confidence interval (CI):

$$DCS(\delta; \alpha) = \left[\hat\theta_1/\hat\theta_2 - z_{\alpha/2}\hat\Sigma_\delta^{1/2},\ \hat\theta_1/\hat\theta_2 + z_{\alpha/2}\hat\Sigma_\delta^{1/2}\right], \tag{3.1}$$

$$\hat\Sigma_\delta = \hat G'\hat\Sigma_{12}\hat G, \quad \hat G = \left(1/\hat\theta_2,\ -\hat\theta_1/\hat\theta_2^2\right)',$$

where $z_{\alpha/2}$ refers to the two-tailed $\alpha$-level standard normal cut-off point. Applying Fieller’s theorem to the context at hand, we are led to consider the restriction

$$\theta_1 - \delta\theta_2 = 0, \tag{3.2}$$

and its associated t-test statistic

$$t(\delta) = \frac{\hat\theta_1 - \delta\hat\theta_2}{\left(\delta^2\hat v_2 - 2\delta\hat v_{12} + \hat v_1\right)^{1/2}} \overset{asy}{\sim} N(0, 1) \tag{3.3}$$

under (3.2). The $1 - \alpha$ level CS which inverts $t(\delta)$ corresponds to the values of $\delta_0$ such that $|t(\delta_0)| \le z_{\alpha/2}$, or alternatively to the set:

$$FCS(\delta; \alpha) = \left\{\delta_0 : \left(\hat\theta_1 - \delta_0\hat\theta_2\right)^2 \le z_{\alpha/2}^2\left(\hat v_1 + \delta_0^2\hat v_2 - 2\delta_0\hat v_{12}\right)\right\}. \tag{3.4}$$

This requires solving the following second degree polynomial inequality for $\delta_0$:

$$A\delta_0^2 + 2B\delta_0 + C \le 0, \tag{3.5}$$

$$A = \hat\theta_2^2 - z_{\alpha/2}^2\hat v_2, \quad B = -\hat\theta_1\hat\theta_2 + z_{\alpha/2}^2\hat v_{12}, \quad C = \hat\theta_1^2 - z_{\alpha/2}^2\hat v_1. \tag{3.6}$$

In appendix A, we discuss the solutions to the latter inequality [see Scheffé (1970), Zerbe et al. (1982) and Dufour (1997)]. Two properties are worth noting. First, $FCS(\delta; \alpha)$ cannot be an empty set. Second, $FCS(\delta; \alpha)$ is either a bounded interval, an unbounded interval, or the entire real line $]-\infty, +\infty[$, where the unbounded solution occurs only when $|\hat\theta_2/(\hat v_2)^{1/2}| < z_{\alpha/2}$, i.e. when the Student’s t-test of the hypothesis $\theta_2 = 0$ is not significant at level $\alpha$.

4 Simultaneous Confidence Sets for Multiple Ratios

In this section, we consider the joint estimation of several ratios of the form (2.3). As proposed in section 2, we invert a Wald test of the following linear hypothesis associated with the $s$ ratios under consideration:

$$RH\theta = 0 \Leftrightarrow L_i'\theta - \rho_i K'\theta = 0, \quad i = 1, ..., s, \tag{4.1}$$

$$H = \begin{bmatrix} L_1 & \dots & L_s & K \end{bmatrix}', \quad R = \begin{bmatrix} I_s & -\rho \end{bmatrix}, \quad -\rho = (-\rho_1, ..., -\rho_s)'. \tag{4.2}$$

The $s \times (s + 1)$ matrix $R$ has full row rank for any $\rho_i$, $i = 1, ..., s$; and since the $s + 1$ vectors $\{L_1, L_2, ..., L_s, K\}$ are linearly independent, the matrix $H$ has full row rank. Let $W_{RH}$ denote the Wald statistic associated with (4.1):

$$W_{RH} = \left(RH\hat\theta\right)'\left(RH\hat\Sigma_\theta H'R'\right)^{-1}\left(RH\hat\theta\right) \overset{asy}{\sim} \chi^2(s)$$

under (4.1). This leads to the following joint CS for the components of $\rho$:

$$CS(\rho; \alpha) = \{\rho \in \mathbb{R}^s : W_{RH} \le c_{s,\alpha}\} \tag{4.3}$$

where $c_{s,\alpha}$ refers to the $1 - \alpha$ percentile point of the $\chi^2(s)$ distribution. On observing that $W_{RH}$ can be decomposed as follows [see Zerbe et al. (1982)],

$$W_{RH} = \left(H\hat\theta\right)'\left(H\hat\Sigma_\theta H'\right)^{-1}\left(H\hat\theta\right) - \left[(\rho', 1)\left(H\hat\Sigma_\theta H'\right)^{-1}H\hat\theta\right]'\left[(\rho', 1)\left(H\hat\Sigma_\theta H'\right)^{-1}(\rho', 1)'\right]^{-1}\left[(\rho', 1)\left(H\hat\Sigma_\theta H'\right)^{-1}H\hat\theta\right], \tag{4.4}$$

we can rewrite (4.3) as:

$$CS(\rho; \alpha) = \left\{\rho \in \mathbb{R}^s : \rho_*'M\rho_* \le 0,\ \rho_* = (\rho', 1)'\right\}, \tag{4.5}$$

where

$$M = c\left(H\hat\Sigma_\theta H'\right)^{-1} - \left[\left(H\hat\Sigma_\theta H'\right)^{-1}H\hat\theta\right]\left[\left(H\hat\Sigma_\theta H'\right)^{-1}H\hat\theta\right]', \tag{4.6}$$

$$c = \left(H\hat\theta\right)'\left(H\hat\Sigma_\theta H'\right)^{-1}\left(H\hat\theta\right) - c_{s,\alpha}. \tag{4.7}$$

Applying the projection technique to the latter set is, at first sight, not a trivial problem. Zerbe et al. (1982) provide a solution without proof (based on adapting, informally, Scheffé (1953, 1959)’s simultaneous confidence limits in ANOVA contexts) which does not cover linear transformations of $\rho$ [of the form (2.6)].
Here we solve the latter problem, which also of course provides a formal proof for Zerbe et al. (1982)’s results on the individual ratios.

To do this, we use Dufour and Taamouti (2005a, 2005b)’s results on the geometry of quadrics. The set of points that satisfy an equation of the form $\rho'\Gamma\rho + \Phi'\rho + \gamma = 0$, where $\Gamma$ is a symmetric $s \times s$ matrix, $\Phi$ is an $s \times 1$ vector and $\gamma$ is a scalar, constitutes a quadric surface. A confidence set for $\rho$ of the form

$$C_\rho = \{\rho_0 : \rho_0'\Gamma\rho_0 + \Phi'\rho_0 + \gamma \le 0\}$$

is a quadric confidence set (Dufour and Taamouti (2005a, 2005b)). Depending on the values of $\Gamma$, $\Phi$, and $\gamma$, it may take several forms, including ellipsoids, paraboloids and hyperboloids. So we proceed to express the CS (4.5) in the quadrics form. To the best of our knowledge, the latter approach has not been considered to date in solving (4.5).

Partition the matrix $M$ according to $\rho_* = (\rho', 1)'$ in the form:

$$M = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix}, \tag{4.8}$$

where $M_{11}$ is an $s \times s$ matrix, $M_{12}$ is an $s \times 1$ vector, $M_{21} = M_{12}'$, and $M_{22}$ is a scalar. Let

$$S_1 = \begin{bmatrix} I_s & 0_s \end{bmatrix}, \quad S_2 = \left(0_s', 1\right), \tag{4.9}$$

so $\rho = S_1\rho_*$, $M_{11} = S_1MS_1'$, $M_{22} = S_2MS_2'$; then

$$\rho_*'M\rho_* = \rho'M_{11}\rho + 2M_{12}'\rho + M_{22}. \tag{4.10}$$

Hence (4.5) can be written as the following quadric CS:

$$CS(\rho; \alpha) = \left\{\rho \in \mathbb{R}^s : \rho'M_{11}\rho + 2M_{12}'\rho + M_{22} \le 0\right\}. \tag{4.11}$$

Further, if we consider scalar linear transformations of $\rho$, then the sets

$$CS(w'\rho; \alpha) = \left\{w'\rho_0 : \rho_0'M_{11}\rho_0 + 2M_{12}'\rho_0 + M_{22} \le 0\right\} \tag{4.12}$$

are simultaneous confidence sets for $w'\rho$, $w \in \mathbb{R}^s\setminus\{0\}$. Using Dufour and Taamouti (2005b)’s general expressions, we can now characterize the explicit form of (4.12). As shown by Dufour and Taamouti (2005b), the result depends on whether $M_{11}$ is singular or not.

Theorem 1 Let $M_{11}$, $M_{12}$, and $M_{22}$ be defined by (4.6)-(4.8). Then $M_{11}$ is nonsingular; specifically, $s - 1$ eigenvalues of $M_{11}$ have the same sign as $c$ [defined in (4.7)] and the remaining eigenvalue has the same sign as

$$a = \left(K'\hat\theta\right)'\left(K'\hat\Sigma_\theta K\right)^{-1}\left(K'\hat\theta\right) - c_{s,\alpha}. \tag{4.13}$$

In addition, if we define

$$d = M_{12}'M_{11}^{-1}M_{12} - M_{22}, \tag{4.14}$$

then $d > 0$ if and only if $M_{11}$ is a positive definite or a negative definite matrix.

This Theorem is proved in Appendix B, where we exploit several useful derivations from Zerbe et al. (1982, Appendix C). Now applying Theorem 1 to Theorems 5.1-5.3 from Dufour and Taamouti (2005b), we obtain the following CS solution.

Theorem 2 Let $M_{11}$, $M_{12}$, $M_{22}$ and $d$ be defined by (4.6)-(4.8) and (4.14); let $w \in \mathbb{R}^s\setminus\{0\}$ and

$$W_{11} = w'M_{11}^{-1}w, \quad f = -M_{11}^{-1}M_{12} \tag{4.15}$$

and let $CS(w'\rho; \alpha)$ refer to the projection-based confidence set for $w'\rho$ defined by (4.12). If all the eigenvalues of $M_{11}$ are positive, then

$$CS(w'\rho; \alpha) = \left[w'f - (dW_{11})^{1/2},\ w'f + (dW_{11})^{1/2}\right].$$

If $M_{11}$ has at least two negative eigenvalues, then $CS(w'\rho; \alpha) = \mathbb{R}$. If $M_{11}$ has exactly one negative eigenvalue, then the following obtains: (i) if $w'M_{11}^{-1}w > 0$, then $CS(w'\rho; \alpha) = \mathbb{R}$; (ii) if $w'M_{11}^{-1}w = 0$, then $CS(w'\rho; \alpha) = \mathbb{R}\setminus\{w'f\}$; (iii) if $w'M_{11}^{-1}w < 0$,

$$CS(w'\rho; \alpha) = \left]-\infty,\ w'f - (dW_{11})^{1/2}\right] \cup \left[w'f + (dW_{11})^{1/2},\ +\infty\right[.$$

We thus see that unbounded CSs occur when $M_{11}$ has negative eigenvalues, which depends on the sign of $a$ [defined in (4.13)]. Interestingly, observe that the sign of $a$ can be linked to a Wald test on the denominator of the ratios. Indeed

$$\left(K'\hat\theta\right)'\left(K'\hat\Sigma_\theta K\right)^{-1}\left(K'\hat\theta\right) < c_{1,\alpha} \ \Rightarrow\ \left(K'\hat\theta\right)'\left(K'\hat\Sigma_\theta K\right)^{-1}\left(K'\hat\theta\right) < c_{s,\alpha} \tag{4.16}$$

where $c_{1,\alpha}$ denotes the $(1 - \alpha)$ percentile point of the $\chi^2(1)$ distribution. In other words, if a Wald test of $K'\theta = 0$ is not significant at level $\alpha$, then $a < 0$ [as defined in (4.13)]. So the case $c > 0$ coupled with a non-significant (at level $\alpha$) Wald test on the denominator term implies that $M_{11}$ has exactly one negative eigenvalue, leading to unbounded or trivial CSs. Theorem 2 yields CSs for individual ratios by convenient choices for $w$.
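The construction in (4.5)-(4.8) and the case analysis of Theorem 2 can be sketched numerically as follows. This is only an illustrative sketch, not the authors' implementation: the function name `projection_cs`, the returned labels, the tolerance `tol` and the toy inputs are all assumptions made here. The example uses $s = 2$, for which the $\chi^2(2)$ critical value equals $-2\ln\alpha$ exactly; for general $s$ one would obtain $c_{s,\alpha}$ from a chi-square quantile routine instead. A negative definite $M_{11}$ is mapped to the whole real line, which is our reading of the case $d > 0$ in Theorem 1.

```python
import numpy as np

def projection_cs(theta_hat, sigma_hat, L, K, w, crit, tol=1e-12):
    """Projection-based confidence set for w'rho, following Theorem 2.

    theta_hat: (p,) estimate; sigma_hat: (p, p) covariance estimate;
    L: (p, s) matrix whose columns are L_1..L_s; K: (p,) denominator vector;
    w: (s,) weights; crit: chi-square(s) critical value c_{s,alpha}.
    Returns (kind, lower, upper); the labels are hypothetical names.
    """
    H = np.vstack([L.T, K])                     # rows L_1', ..., L_s', K'
    y = H @ theta_hat
    V_inv = np.linalg.inv(H @ sigma_hat @ H.T)
    u = V_inv @ y
    c = y @ u - crit                            # equation (4.7)
    M = c * V_inv - np.outer(u, u)              # equation (4.6)
    M11, M12, M22 = M[:-1, :-1], M[:-1, -1], M[-1, -1]
    M11_inv = np.linalg.inv(M11)
    f = -M11_inv @ M12                          # equation (4.15)
    d = M12 @ M11_inv @ M12 - M22               # equation (4.14)
    W11 = w @ M11_inv @ w
    centre = w @ f
    eig = np.linalg.eigvalsh((M11 + M11.T) / 2) # symmetrize against rounding
    n_neg = int(np.sum(eig < 0))
    if n_neg == 0:                              # bounded interval
        half = np.sqrt(d * W11)
        return "bounded", centre - half, centre + half
    if n_neg >= 2 or n_neg == len(eig):         # trivial set
        return "real line", -np.inf, np.inf
    if W11 > tol:                               # exactly one negative eigenvalue
        return "real line", -np.inf, np.inf
    if abs(W11) <= tol:
        return "real line minus one point", centre, centre
    half = np.sqrt(d * W11)                     # d < 0 and W11 < 0 here
    return "outside open interval", centre - half, centre + half

alpha = 0.05
crit = -2.0 * np.log(alpha)          # exact chi-square(2) cut-off (s = 2 only)
sigma_hat = 0.04 * np.eye(3)         # toy covariance matrix (assumption)
L = np.eye(3)[:, :2]                 # L_1 = e_1, L_2 = e_2
K = np.eye(3)[:, 2]                  # K = e_3
w = np.array([1.0, 0.0])             # selects the ratio theta_1 / theta_3

strong = projection_cs(np.array([2.0, 1.0, 4.0]), sigma_hat, L, K, w, crit)
weak = projection_cs(np.array([2.0, 1.0, 0.1]), sigma_hat, L, K, w, crit)
print(strong)
print(weak)
```

With the denominator estimate far from zero (`strong`), the set is a bounded interval; with a denominator estimate that is insignificant at the $c_{s,\alpha}$ cut-off (`weak`), the set is the complement of an open interval, and the two returned numbers are the finite endpoints of the two half-lines.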
We prove in Appendix C that for any individual ratio $\rho_i$, $i = 1, 2, ..., s$, the confidence limits we obtain coincide numerically with the solutions of the following quadratic inequality:

$$A_i\rho_{i0}^2 + 2B_i\rho_{i0} + C_i \le 0, \tag{4.17}$$

where $A_i = (K'\hat\theta)^2 - c_{s,\alpha}K'\hat\Sigma_\theta K$, $B_i = c_{s,\alpha}L_i'\hat\Sigma_\theta K - (L_i'\hat\theta)(K'\hat\theta)$ and $C_i = (L_i'\hat\theta)^2 - c_{s,\alpha}L_i'\hat\Sigma_\theta L_i$. This is exactly the same solution informally suggested by Zerbe et al. (1982). We clearly see that computing the latter CSs does not require any extra cost compared to the delta method.

To conclude, observe that Theorem 1 implies, in conjunction with Dufour and Taamouti’s results, that the projection based CS for any linear transformation $w'\rho$ cannot be empty. Indeed, as may be seen from section 5 in Dufour and Taamouti (2005b), the only case where the CS is empty corresponds to a positive definite $M_{11}$ with $d < 0$. Here Theorem 1 conveniently rules out this case.

5 Simulation based and empirical illustrations

We focus on a general discrete choice model: the multinomial probit with a logit kernel. Since the properties of standard asymptotics in this class of models are little documented, it is important to assess the performance of both confidence set procedures within this framework. The model can be described as follows. Assuming that each decision maker $n$ ($n = 1, ..., T$) faces $J$ discrete alternatives, usual random utility formulations imply:

$$\varsigma_{in} = \begin{cases} 1 & \text{if } U_{in} \ge U_{jn} \text{ for } j = 1, 2, ..., J \\ 0 & \text{otherwise,} \end{cases} \tag{5.1}$$

$$U_{in} = X_{in}\beta + \varepsilon_{in}, \quad n = 1, ..., T, \quad i = 1, 2, ..., J \tag{5.2}$$

where $U_{in}$ is the unobservable utility individual $n$ derives from alternative $i$, $\varsigma_{in}$ designates the choice of individual $n$, $X_{in}$ is a $(1 \times K)$ vector of observable covariates, and the $(K \times 1)$ vector $\beta$ constitutes the parameter of interest. In this context, the choice probability associated with the alternative $i$ chosen by individual $n$ is defined by:

$$P_n(i) = P(U_{in} \ge U_{jn} \text{ for } j = 1, 2, ..., J). \tag{5.3}$$

For convenience, we write (5.2) in the following compact form: $U_n = X_n\beta + \varepsilon_n$, where $U_n = (U_{1n}, U_{2n}, ..., U_{Jn})'$ and $\varepsilon_n = (\varepsilon_{1n}, \varepsilon_{2n}, ..., \varepsilon_{Jn})'$ are $J \times 1$ vectors, and $X_n$ is the $(J \times K)$ matrix with rows $X_{in}$, $i = 1, 2, ..., J$.

The model formulation (5.2) is very general. The assumptions regarding the error term $\varepsilon_n$ allow one to define several classes of sub-models. For example, assuming $\varepsilon_n \overset{i.i.d.}{\sim} N(0, \Sigma)$ gives the Multinomial Probit [MNP] model. In this case $P_n(i)$ requires the evaluation of multidimensional integrals, which may be analytically intractable for large choice sets; in particular, when the choice set involves four or more alternatives, the choice probabilities are usually simulated. Assuming $\varepsilon_n \overset{i.i.d.}{\sim}$ Gumbel leads to the Multinomial Logit [MNL] model, in which case $P_n(i)$ are analytically tractable. We consider the logit kernel formulation that results from a convenient combination of MNP and MNL (see Ben-Akiva et al. (2001)):

$$U_n = X_n\beta + \varepsilon_n \tag{5.4}$$

$$\varepsilon_n = W\xi_n + \mu\nu_n, \quad W = FG, \tag{5.5}$$

where $\xi_n \overset{i.i.d.}{\sim} N(0, I_J)$, $\nu_n \overset{i.i.d.}{\sim}$ Gumbel, $G$ is a diagonal matrix of dimension $(J \times J)$ with non-negative entries on its main diagonal, the matrix $F$ captures the correlation structure among the error terms (see e.g. Bolduc (1999)) and $\mu$ is a selection parameter with values 0 or 1: if $\mu = 0$, we have an MNP specification; on the other hand if $\mu = 1$ and $G = 0$ (or alternatively, if $\xi_n$ is non-random) we get the MNL model. Assuming that $G \ne 0$ and $\mu = 1$ yields the mixed logit model [MXL], also known as the multinomial probit with a logit kernel. The MXL formulation is attractive because $P_n(i)$ can be conveniently specified, conditionally on $\xi_n$, as:

$$P_n(i \mid \xi_n) = \frac{e^{X_{in}\beta + W_i\xi_n}}{\sum_{j=1}^{J} e^{X_{jn}\beta + W_j\xi_n}}, \tag{5.6}$$

where $W_j$ denotes the $j$th row of $W$, so $P_n(i)$ obtains by integrating $P_n(i \mid \xi_n)$ over the domain of $\xi_n$ which leads to a $J$-dimensional unbounded integral.
To circumvent the curse of dimensionality, the SML approach (see McFadden (1989)) is typically considered where the multivariate integral is replaced by an approximation obtained by simulation. Using $S$ independent draws [$\xi_n^r$, $r = 1, ..., S$] from the distribution of $\xi_n$, the empirical mean

$$\hat P_n(i) = \frac{1}{S}\sum_{r=1}^{S} P_n(i \mid \xi_n^r), \tag{5.7}$$

provides a valid estimator for the choice probability $P_n(i)$, which allows one to define a simulated likelihood function suitable for estimation applying standard algorithms.

Our simulation studies and empirical applications are conducted for specific cases of this general discrete choice model. Conforming with our general notational framework [see section 2], the vector of unknown parameters which describes the model is denoted by $\theta = (\beta', \bar\beta')'$, where the sub-vector $\beta$ is the parameter of interest as defined in (5.2), and $\bar\beta$ contains the nuisance parameters associated with the various hypotheses on the model’s error term. We will also refer (as set up in section 2) to the components of $\theta$ as $\theta_1, \theta_2, ..., \theta_p$.

5.1 Simulation studies

We first consider, for motivational purposes, a simulation study of a simple binary probit model, which corresponds to (5.1)-(5.5) with $\mu = 0$, $J = 2$ and $W = I_2$. The matrix of covariates includes, in addition to a constant [whose coefficient is denoted $\theta_1$], two regressors [the coefficients of which constitute our parameters of interest denoted $\theta_2$, $\theta_3$] drawn independently as standard normal. The parameters are set as follows: $\theta_1 = 1$, $\theta_2 = 3.3$ and $\theta_3$ varies from 2 to 0.0001; the sample size is set to $T = 100, 250, 1000, 5000$, and $10000$. We construct 95%-level CSs for $\delta = \theta_2/\theta_3$; based on 10000 replications, we compute the empirical coverage rate for both delta method and Fieller based procedures. Results are shown in Table 1.
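The logic of such a coverage experiment can be sketched with a deliberately simplified design. The sketch below does not estimate a probit model: it assumes, purely for illustration, that the numerator and denominator estimates are drawn directly as independent normal variates with a known standard error `se`, and it uses the section 3 notation $\delta = \theta_1/\theta_2$. The function names, the seed, the number of replications and the parameter values are all hypothetical choices.

```python
import math
import random

Z975 = 1.959964  # two-sided 5% standard normal cut-off

def delta_covers(t1, t2, v1, v2, v12, delta0, z=Z975):
    """True if the delta-method interval (3.1) contains delta0."""
    g1, g2 = 1.0 / t2, -t1 / t2 ** 2            # gradient of theta1/theta2
    var = g1 * g1 * v1 + 2.0 * g1 * g2 * v12 + g2 * g2 * v2
    return abs(t1 / t2 - delta0) <= z * math.sqrt(var)

def fieller_covers(t1, t2, v1, v2, v12, delta0, z=Z975):
    """True if delta0 satisfies inequality (3.4), i.e. |t(delta0)| <= z."""
    lhs = (t1 - delta0 * t2) ** 2
    rhs = z * z * (v1 + delta0 ** 2 * v2 - 2.0 * delta0 * v12)
    return lhs <= rhs

def coverage(theta1, theta2, se, reps=20000, seed=12345):
    """Empirical coverage of both sets for delta = theta1/theta2.

    Simplifying assumption: the two estimates are independent normal
    draws centred at the true values with known standard error `se`.
    """
    rng = random.Random(seed)
    delta0 = theta1 / theta2
    v = se * se
    hits_delta = hits_fieller = 0
    for _ in range(reps):
        t1 = rng.gauss(theta1, se)
        t2 = rng.gauss(theta2, se)
        hits_delta += delta_covers(t1, t2, v, v, 0.0, delta0)
        hits_fieller += fieller_covers(t1, t2, v, v, 0.0, delta0)
    return hits_delta / reps, hits_fieller / reps

for denom in (2.0, 0.1, 0.01):
    d_cov, f_cov = coverage(3.3, denom, se=0.1)
    print(f"theta2={denom}: delta={d_cov:.3f}, Fieller={f_cov:.3f}")
```

Because membership of the true ratio in the Fieller set only requires evaluating (3.4) at that value, no interval endpoints need to be computed to measure coverage; in this toy design the statistic $t(\delta)$ is exactly standard normal, so the Fieller coverage does not depend on the size of the denominator.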
These results show that the empirical coverage rate of the delta method based CS deteriorates rapidly as the denominator becomes close to zero, no matter how large the sample size. When the denominator value is lower than 0.1, the empirical coverage rate deviates markedly from the nominal level. In contrast, the Fieller-type method, although it is approximate in our application, does not suffer from such problems. These results, particularly regarding the delta method, are noteworthy in view of the very simple design considered here. Of course, the same arguments call for studying a more complicated choice model, to assess the Fieller case.

Table 1: Empirical coverage rates for the delta method and the Fieller method based confidence sets for a parameter ratio in a simple binary probit model.

θ3       T = 100         T = 250         T = 1000        T = 5000        T = 10000
         DCS    FCS      DCS    FCS      DCS    FCS      DCS    FCS      DCS    FCS
2        93.30  95.33    94.71  95.29    95.09  95.38    95.05  95.11    94.84  94.93
1        90.97  95.82    94.01  95.19    94.97  95.13    95.08  95.06    94.84  95.01
0.5      88.58  95.50    91.11  95.33    94.38  94.87    94.86  94.92    95.14  94.96
0.4      89.00  95.48    90.65  94.90    93.91  94.93    94.68  94.69    94.81  94.88
0.3      82.11  95.34    89.93  94.80    92.81  95.09    94.98  95.01    94.77  94.75
0.2      79.15  95.77    85.48  95.12    91.36  94.97    94.07  94.61    95.10  95.07
0.1      61.59  95.79    72.97  95.08    85.82  95.13    91.81  94.99    93.16  95.22
10^-2    22.03  95.66    30.44  94.89    41.07  95.50    56.86  95.09    64.47  94.96
10^-3    06.79  95.69    10.17  94.99    13.90  94.89    19.56  94.54    24.15  94.92
10^-4    02.42  95.67    02.98  95.19    04.29  94.79    06.38  95.41    07.71  95.34

Note: Numbers reported are empirical coverage rates (in percent) for the confidence set based on the delta method [in the columns titled “DCS”] and for the one based on the Fieller’s method [in the columns titled “FCS”]. θ3 is the denominator of the ratio and T is the sample size. The nominal confidence level is 95%.

So we next consider a MXL model as specified by (5.4) and (5.5) with $J = 3$,

$$G = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \sqrt{2} \end{bmatrix}, \quad F = \left(I_3 - 0.6\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}\right)^{-1},$$

and $\mu = 1$. $X_n$ is composed of $K = 5$ variables [with coefficient $\beta = (\theta_1, ..., \theta_5)'$] that are drawn as 1.5 times a uniform $[0, 1]$ distribution. The parameters are set as follows: $\theta_i = 3$, for $i = 1, 2, 4, 5$ in all experiments, whereas $\theta_3$ varies from 3 to 0.0001. The sample size is set to $T = 1000, 5000$, and $10000$. We estimate the model by SML, using (5.7) with $S = 50$ draws.⁴ We construct 95%-level CSs for $\delta^* = \theta_2/\theta_3$, and we compute empirical coverage rates based on 1000 replications. Results are reported in Table 2.

Our results here are very similar to the simple binary probit case: the Fieller-type CSs still perform very well, whereas empirical coverage rates for the delta method deviate markedly from the nominal level when the denominator value is lower than 0.1. Coverage of the delta-method CIs somewhat improves with sample size, yet remains well below the nominal level at every sample size considered. Given the complexity of the model considered, these results illustrate the usefulness of Fieller-based inference.
Table 2: Empirical coverage rates (in %) for the delta method and the Fieller method based confidence sets for a parameter ratio in a multinomial probit model with a logit kernel.

         T = 1000        T = 5000        T = 10000
θ3       DCS    FCS      DCS    FCS      DCS    FCS
3        94.3   95.1     95.2   94.6     94.0   93.7
2        94.8   95.4     95.0   95.5     94.8   94.9
1        93.3   93.7     94.4   94.5     94.5   94.2
0.5      90.5   93.9     95.1   94.1     94.3   93.8
0.3      87.8   94.7     91.6   94.9     93.6   93.4
0.2      82.2   94.1     90.6   95.1     91.8   94.3
0.1      69.2   93.5     84.5   94.1     86.8   94.8
10^-2    25.9   93.4     37.4   94.4     45.7   93.7
10^-3     7.6   93.9     13.1   94.7     15.8   94.2
10^-4     2.6   94.0      4.6   94.8      5.4   93.9

Note: See notes for Table 1.

Footnote 4: See Walker (2001) and Bolduc (1999) for recommendations on identifying MXL models and for guidelines on the choice of S. Our choice of 50 draws is compatible with the latter guidelines, given our framework.

5.2 Empirical Application: discrete choice models of travel demand

Discrete choice techniques are commonly applied to analyze transportation-related problems. Here we consider two empirical examples from this literature. Following our simulation studies, we consider a relatively simple set-up where usual maximum likelihood is readily feasible, and a more complicated setting which requires SML.

We first consider the three-alternative logit transportation model analyzed in Ben-Akiva and Lerman (1985, Chapters 3, 5 and 7). The model corresponds to (5.1)-(5.5) where G = 0, µ = 1 and J = 3. The variables in Xn include two alternative-specific constants, three generic attributes of the travel modes [the coefficients of which constitute our parameters of interest, denoted θ3, θ4, θ5]: (i) round trip travel time [the sum of in-vehicle and out-of-vehicle times], (ii) round trip out-of-vehicle time/one-way distance, (iii) round trip travel cost/household income, as well as seven alternative-specific socioeconomic and locational characteristics of worker n. This model is estimated by maximum likelihood using data for a sample of 1136 workers taken from a 1968 survey in the Washington D.C.
metropolitan area. In this setting, the economic value of travel time can be defined as the marginal rate of substitution between the time and cost variables (see footnote 5). In particular, since round trip travel time is the sum of in-vehicle and out-of-vehicle times, the value of total travel time is equal to that of in-vehicle time and is given by

\[
\delta_{tot} = \frac{\theta_3}{\theta_5} \times \text{household income}. \tag{5.8}
\]

Similarly, the value of out-of-vehicle time is

\[
\delta_{out} = \left[ \frac{\theta_3}{\theta_5} + \frac{\theta_4}{\theta_5 \times (\text{one-way distance})} \right] \times \text{household income}. \tag{5.9}
\]

Footnote 5: Although various theories of time allocation reveal that the value of travel time can be perceived in different ways, most empirical studies refer to the value of travel time as the amount of money the traveler agrees to pay in order to save one unit of the total duration of his travel [Ashton (1947), De Vany (1974), Truong and Hensher (1985), Bates (1987), Ben-Akiva et al. (1993)]. In a discrete choice framework, when the traveler’s utility function is specified as a linear function of travel cost, travel time and other variables, his evaluation of the value of travel time is, up to a scalar constant, equal to the ratio of the coefficient of the time variable over the coefficient of the cost variable [Truong and Hensher (1985), Bates (1987)].

Table 3: Simultaneous confidence sets for values of total travel time and of out-of-vehicle time from Ben-Akiva and Lerman (1985)’s trinomial logit model; 95% nominal level.

Type of travel time            Delta method         Fieller method
Total travel time (δ_tot)      [-2.655, 30.253]     ]-∞, -40.843] ∪ [3.918, +∞[
Out-of-vehicle time (δ_out)    [-4.055, 46.639]     ]-∞, -66.877] ∪ [6.760, +∞[

Note: The delta method-based confidence intervals are not simultaneous.

Ben-Akiva and Lerman (1985) compute point estimates for the two parameter functions δ_tot and δ_out. Conforming with our notation set in section 2, let h1(θ) = θ3/θ5 and h2(θ) = θ4/θ5. If θ5 is close to zero, then the functions δ_tot and δ_out will be weakly identified.
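As a reading aid, expressions (5.8) and (5.9) can be recovered from the marginal-rate-of-substitution definition. The sketch below assumes that the systematic utility is linear in the three generic attributes of the travel modes. The symbols V (systematic utility), IVT and OVT (in-vehicle and out-of-vehicle time), D (one-way distance), C (travel cost) and Y (household income) are our own shorthand, introduced only for this illustration.

```latex
\begin{align*}
V &= \theta_3\,(\mathrm{IVT} + \mathrm{OVT})
     + \theta_4\,\frac{\mathrm{OVT}}{D}
     + \theta_5\,\frac{C}{Y} + \dots, \\
\delta_{tot} &= \frac{\partial V / \partial\,\mathrm{IVT}}{\partial V / \partial C}
     = \frac{\theta_3}{\theta_5 / Y}
     = \frac{\theta_3}{\theta_5}\, Y, \\
\delta_{out} &= \frac{\partial V / \partial\,\mathrm{OVT}}{\partial V / \partial C}
     = \frac{\theta_3 + \theta_4 / D}{\theta_5 / Y}
     = \left[ \frac{\theta_3}{\theta_5} + \frac{\theta_4}{\theta_5\, D} \right] Y .
\end{align*}
```

Both values of time share the denominator θ5, which is why a cost coefficient close to zero weakens the identification of both functions at once.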
Here, using the estimation results from Ben-Akiva and Lerman (1985, Table 7.1 and Figure 7.1), we obtain 95%-level CSs for the ratios h1(θ) and h2(θ). The delta method yields

\[
\begin{aligned}
DCS(h_1(\theta); .95) &= [-.0002089,\ .0023483], \\
DCS(h_2(\theta); .95) &= [-.1974734,\ 1.1382400],
\end{aligned}
\tag{5.10}
\]

whereas the Fieller-type method (refer to section 3) gives

\[
\begin{aligned}
FCS(h_1(\theta); .95) &= \left]-\infty,\ -.0151209\right] \cup \left[.0003947,\ +\infty\right[, \\
FCS(h_2(\theta); .95) &= \left]-\infty,\ -7.2190631\right] \cup \left[.0826500,\ +\infty\right[.
\end{aligned}
\tag{5.11}
\]

Clearly, the Fieller-type CS, which is unbounded, is in serious conflict with the delta-method-based CS. Note that the t-statistic for the round trip travel cost/household income variable [the denominator] is 1.8 (see Ben-Akiva and Lerman (1985, p. 158)); since this is below the two-sided 5% critical value of 1.96, it concurs with the unbounded CS result. However, one important result deserves notice: whereas both delta-method-based CSs cover zero [which suggests that travel time has a non-significant economic value], we see that the Fieller-type CSs exclude this (counter-intuitive) case.

We next construct simultaneous CSs for the value of total travel time δ_tot and the value of out-of-vehicle time δ_out [defined in (5.8) and (5.9) respectively]. For comparison purposes, we also compute the delta method CSs, which are not simultaneous. We use sample average values for annual household income (equal to $12,900 per year) and for one-way distance (equal to 810 centimiles). Our results are reported in Table 3. Once again, the CSs are in serious conflict; the Fieller-based CSs are unbounded, yet in contrast with the delta method, they still exclude the zero value. In view of our simulation studies, which assess the relative worth of both methods, our results suggest that (despite a close-to-zero denominator) the economic value of time is significant at the 5% level for this data set.

We now turn to a more complicated setting. We consider the MNP model with correlated utilities analyzed by Bolduc (1999).
Nine alternatives are considered, along with a first-order Generalized Autoregressive error process [Bolduc (1992)]. In our notation, the model obtains from (5.1)-(5.5) with µ = 0 and various choices for F [refer to the notes of Tables 4 and 5]. Given the dimensionality and error structure, the model is estimated by SML with 50 or 250 draws (see footnote 6), given a data bank on the choice of transportation modes for the morning peak journey to work in the central business district of Santiago. The covariates considered include mode-specific dummies, a sex dummy, a dummy for no cars/no permit holders, cost/income, and three specific uses of travel time [the coefficients of which are our parameters of interest]: walking time, in-vehicle time, and waiting time. The ratio of the latter coefficients with respect to the coefficient of the cost/income variable yields three specific values of time which we aim to estimate.

Table 4: Simultaneous confidence sets for values of time as percentage of net personal income, 95% nominal level.

Type of travel time   Confidence set   MNP i.i.d.           SML MNP R = 50         SML MNP R = 50
                                                            homoscedastic          unconstrained
In-vehicle time       delta            [117.90, 285.52]     [122.07, 300.46]       [102.69, 265.37]
                      Fieller          [95.77, 352.52]      [101.77, 382.24]       [61.88, 411.03]
Walking time          delta            [240.79, 450.66]     [239.39, 468.58]       [178.65, 397.82]
                      Fieller          [222.08, 548.30]     [223.13, 590.17]       [141.48, 631.17]
Waiting time          delta            [453, 1093.37]       [507.58, 1201.34]      [286.99, 830.65]
                      Fieller          [370.12, 1350.40]    [437.36, 1533.36]      [178.70, 1373.47]

Note: See notes at the end of Table 5.

From the estimation results reported in Bolduc (1999), we obtain simultaneous CSs for the three ratios. For comparison purposes, we also compute the delta method CSs, which are not simultaneous. Results are summarized in Tables 4 and 5. Here, both procedures yield bounded CSs, which signals no identification problems. As expected, the three components of travel time considered have significant economic values.
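The following sketch illustrates how Fieller-type sets for several ratios sharing one denominator might be computed from an estimate and its covariance matrix. It is written under stated assumptions and is not the authors' code: the coefficients A, B, C are taken to be the vector analogues of those in Appendix A (consistent with the sum and product of roots given in Appendix C), the critical value `crit` is supplied by the user (using a chi-square quantile with s degrees of freedom for s ratios is our assumption about c_{s,α}), and all names and numbers are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two plain Python lists."""
    return sum(x * y for x, y in zip(a, b))


def quad_form(a, mat, b):
    """Return a' * mat * b for plain Python lists."""
    return sum(a[i] * mat[i][j] * b[j]
               for i in range(len(a)) for j in range(len(b)))


def fieller_ratio_set(L, K, theta_hat, sigma_hat, crit):
    """Fieller-type set for (L'theta)/(K'theta) given a critical value.

    Returns ("bounded", lo, hi) for [lo, hi],
            ("unbounded", lo, hi) for ]-inf, lo] U [hi, +inf[,
            ("real_line",) when every value is retained.
    The measure-zero case A == 0 is not handled in this sketch.
    """
    num, den = dot(L, theta_hat), dot(K, theta_hat)
    A = den ** 2 - crit * quad_form(K, sigma_hat, K)
    B = -num * den + crit * quad_form(K, sigma_hat, L)
    C = num ** 2 - crit * quad_form(L, sigma_hat, L)
    disc = B ** 2 - A * C
    if disc < 0:
        return ("real_line",)
    r1 = (-B - math.sqrt(disc)) / A
    r2 = (-B + math.sqrt(disc)) / A
    lo, hi = min(r1, r2), max(r1, r2)
    return ("bounded", lo, hi) if A > 0 else ("unbounded", lo, hi)


if __name__ == "__main__":
    # Hypothetical estimates: three time coefficients and one cost coefficient.
    theta_hat = [-0.05, -0.10, -0.20, -0.02]
    ses = [0.01, 0.02, 0.04, 0.004]
    sigma_hat = [[ses[i] ** 2 if i == j else 0.0 for j in range(4)]
                 for i in range(4)]
    K = [0, 0, 0, 1]                 # selects the common denominator
    chi2_3_95 = 7.815                # 95% chi-square quantile, 3 d.f.
    for i in range(3):
        L = [1 if j == i else 0 for j in range(4)]
        print(i + 1, fieller_ratio_set(L, K, theta_hat, sigma_hat, chi2_3_95))
```

Replacing the chi-square value with the squared normal quantile (about 3.8415) would give the narrower set for a single ratio considered in isolation, which mirrors the wider simultaneous sets reported in Tables 4 and 5.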
In all cases, the Fieller-type CSs are wider and cover the ones based on the delta method. Recall of course that the Fieller-type CSs for the three values of time hold simultaneously, which guarantees joint level control.

Footnote 6: The estimation method combines the SML and the Geweke-Hajivassiliou-Keane (GHK) choice probability simulator based on analytically computed scores.

Table 5: Simultaneous confidence sets for values of time as percentage of net personal income, 95% nominal level (continued).

Type of travel time   Confidence set   SML MNP R = 250       SML MNP R = 250
                                       unconstrained         constrained
In-vehicle time       delta            [110.32, 286.98]      [121.00, 307.96]
                      Fieller          [65.89, 457.18]       [76.41, 489.12]
Walking time          delta            [182.09, 413.40]      [192.22, 439.45]
                      Fieller          [143.11, 678.24]      [151.55, 719.44]
Waiting time          delta            [300.35, 879.82]      [310.43, 902.24]
                      Fieller          [189.56, 1512.05]     [205.82, 1555.02]

Note: The delta method-based CIs are not simultaneous. The CSs are obtained from estimating a MNP model with first-order Generalized Autoregressive errors, assuming several error correlation structures. The column entitled “MNP i.i.d.” corresponds to the independent probit model with homoscedastic errors. The column entitled “SML R=50 homoscedastic” corresponds to the MNP with cross-correlated and homoscedastic errors; SML R=50 or 250 refers to the number of draws underlying the SML method. The columns entitled “SML R=50 unconstrained” and “SML R=250 unconstrained” refer to a specification where the specific errors are not constrained to have an equal variance (heteroscedasticity). The column entitled “SML R=250 constrained” corresponds to a specification where homoscedasticity is assumed within groups of alternatives (see Bolduc (1999, p. 75)).

6 Conclusion

This paper considers the problem of constructing CS estimates for parameter ratios and proposes easy-to-compute simultaneous Fieller method-based confidence limits that hold asymptotically under mild regularity conditions. Even in identifiable econometric models, parameter ratios involve a possibly discontinuous parameter transformation that leads to the failure of standard Wald-type CI estimation methods. We document this problem in discrete choice models, and show that our alternative Fieller-based CSs perform quite well despite the possibility of weak identification. Since the formulas we provide are as simple to calculate as the usual Wald-type ones, our results indicate that the delta method should be avoided in favour of the Fieller one.

Unbounded CSs occur when the denominator of the ratios at hand is not significantly different from zero; this might suggest a pre-test sequential inference procedure. Our results show that the Fieller approach integrates this pre-test within the CS estimation procedure, ensuring global level control.

Our methods of proof illustrate the usefulness of the geometric approach to projections introduced by Dufour and Taamouti (2005a, 2005b). Indeed, we have exploited the latter approach to solve the problem of estimating elasticities in commonly used econometric settings. The recent literature on weak identification (references are provided in the introduction) suggests that projection techniques such as the one we apply here will be gaining popularity in econometric practice. Of course, the results discussed in this paper remain asymptotic beyond the normal regression framework. The development of exact inference methods in discrete choice models is an appealing research objective.

Appendix

A The Fieller-type solution for one parameter ratio

This appendix characterizes the Fieller-type confidence set for one parameter ratio; see Scheffé (1970) and Zerbe et al. (1982). We solve the inequality

\[
A\delta_0^2 + 2B\delta_0 + C \le 0,
\]
\[
A = \hat\theta_2^2 - z_{\alpha/2}^2\,\hat v_2, \qquad
B = -\hat\theta_1\hat\theta_2 + z_{\alpha/2}^2\,\hat v_{12}, \qquad
C = \hat\theta_1^2 - z_{\alpha/2}^2\,\hat v_1,
\]

for real solutions δ0. Except for a set of measure zero, A ≠ 0. Similarly, except for a set of measure zero, ∆ = B² − AC ≠ 0.
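To see where the coefficients A, B and C come from, the following sketch expands the inverted test. It assumes the usual Fieller construction for δ = θ1/θ2 (our reading of the method, not a statement taken from this appendix): a value δ0 is kept in the confidence set when the squared t-type statistic for the linear restriction θ1 − δ0θ2 = 0 does not exceed z²_{α/2}, with v̂1, v̂2 and v̂12 denoting the estimated variances and covariance of θ̂1 and θ̂2.

```latex
\begin{align*}
\frac{(\hat\theta_1 - \delta_0\hat\theta_2)^2}
     {\hat v_1 - 2\delta_0\hat v_{12} + \delta_0^2\hat v_2} \le z_{\alpha/2}^2
&\iff (\hat\theta_1 - \delta_0\hat\theta_2)^2
      - z_{\alpha/2}^2\left(\hat v_1 - 2\delta_0\hat v_{12} + \delta_0^2\hat v_2\right) \le 0 \\
&\iff \underbrace{\left(\hat\theta_2^2 - z_{\alpha/2}^2\hat v_2\right)}_{A}\,\delta_0^2
      + 2\underbrace{\left(-\hat\theta_1\hat\theta_2 + z_{\alpha/2}^2\hat v_{12}\right)}_{B}\,\delta_0
      + \underbrace{\left(\hat\theta_1^2 - z_{\alpha/2}^2\hat v_1\right)}_{C} \le 0 .
\end{align*}
```

In particular, A < 0 exactly when the squared t-ratio of the denominator, θ̂2²/v̂2, falls below z²_{α/2}, that is, when the denominator is not significantly different from zero.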
Real roots

\[
\delta_{01} = \frac{-B - \sqrt{\Delta}}{A}, \qquad
\delta_{02} = \frac{-B + \sqrt{\Delta}}{A}
\]

exist if and only if ∆ > 0, in which case

\[
FCS(\delta; 1-\alpha) =
\begin{cases}
[\delta_{01},\ \delta_{02}] & \text{if } A > 0, \\
\left]-\infty,\ \delta_{02}\right] \cup \left[\delta_{01},\ +\infty\right[ & \text{if } A < 0,
\end{cases}
\]

where the roots satisfy δ02 < δ01 when A < 0. If ∆ < 0, then A < 0, so that Aδ0² + 2Bδ0 + C < 0 for all δ0 ∈ R and FCS(δ; 1 − α) = R. Indeed,

\[
\Delta = \left(\hat v_{12}^2 - \hat v_1\hat v_2\right) z_{\alpha/2}^4
       + \left(\hat\theta_1^2\hat v_2 + \hat\theta_2^2\hat v_1 - 2\hat\theta_1\hat\theta_2\hat v_{12}\right) z_{\alpha/2}^2 .
\]

Since v̂12² − v̂1v̂2 < 0 (by the Cauchy-Schwarz inequality), ∆ is negative if and only if

\[
z^{*} = \frac{\hat\theta_1^2\hat v_2 + \hat\theta_2^2\hat v_1 - 2\hat\theta_1\hat\theta_2\hat v_{12}}
             {\hat v_1\hat v_2 - \hat v_{12}^2} < z_{\alpha/2}^2 . \tag{A.1}
\]

Furthermore,

\[
\frac{\hat\theta_2^2}{\hat v_2} - z^{*}
 = -\frac{\left(\hat\theta_2\hat v_{12} - \hat\theta_1\hat v_2\right)^2}
         {\hat v_2\left(\hat v_1\hat v_2 - \hat v_{12}^2\right)} < 0 .
\]

So ∆ < 0 ⇒ θ̂2²/v̂2 < z* < z²_{α/2}, which establishes that ∆ < 0 ⇒ A < 0.

B Proof of Theorem 1

To prove Theorem 1, we need the following result, known as Sylvester’s law of inertia; refer to Lancaster and Tismenetsky (1985, pp. 184-205) for a proof.

Lemma 1 (Sylvester’s law of inertia) Let Π1 and Π2 be any p × p symmetric matrices of the same rank r ≤ p. If Π1 = NΠ2N′ for some matrix N, then Π1 and Π2 have the same number of positive eigenvalues.

Our framework supposes that Σ̂θ is symmetric and positive definite. Then, since H [defined in (4.2)] has full row rank, it follows that (HΣ̂θH′)⁻¹ is symmetric and positive definite. Similarly, since S1 [defined in (4.9)] has full row rank, Q = S1(HΣ̂θH′)⁻¹S1′ is symmetric and positive definite. Then, there exists a nonsingular matrix P such that P′QP = Is. Using Lemma 1, the two matrices P′M11P and M11 have the same number of positive eigenvalues and the same number of negative eigenvalues. Focusing on the matrix P′M11P, we have:

\begin{align*}
P' M_{11} P &= P' \left( S_1 M S_1' \right) P \\
&= P' S_1 \left( c \left(H\hat\Sigma_\theta H'\right)^{-1}
   - \left[\left(H\hat\Sigma_\theta H'\right)^{-1} H\hat\theta\right]
     \left[\left(H\hat\Sigma_\theta H'\right)^{-1} H\hat\theta\right]' \right) S_1' P \\
&= c\, P' S_1 \left(H\hat\Sigma_\theta H'\right)^{-1} S_1' P
   - P' S_1 \left[\left(H\hat\Sigma_\theta H'\right)^{-1} H\hat\theta\right]
     \left[\left(H\hat\Sigma_\theta H'\right)^{-1} H\hat\theta\right]' S_1' P \\
&= c\, I_s - \left[P' S_1 \left(H\hat\Sigma_\theta H'\right)^{-1} H\hat\theta\right]
             \left[P' S_1 \left(H\hat\Sigma_\theta H'\right)^{-1} H\hat\theta\right]' .
\end{align*}

The last expression shows that P′M11P is a patterned matrix of the type discussed in Graybill (1983, p. 206). Thus, P′M11P has s − 1 eigenvalues equal to c and one eigenvalue equal to

\[
c - \left[P' S_1 \left(H\hat\Sigma_\theta H'\right)^{-1} H\hat\theta\right]'
    \left[P' S_1 \left(H\hat\Sigma_\theta H'\right)^{-1} H\hat\theta\right] .
\]

Zerbe et al. (1982) show that in fact

\[
c - \left[P' S_1 \left(H\hat\Sigma_\theta H'\right)^{-1} H\hat\theta\right]'
    \left[P' S_1 \left(H\hat\Sigma_\theta H'\right)^{-1} H\hat\theta\right] = a
\]

where a is defined in (4.13). Except for a set of values for θ̂ of measure zero, c ≠ 0 and a ≠ 0; so, zero is not an eigenvalue of M11 and M11 is nonsingular. The sign of det(M11) is the same as the sign of det(P′M11P) = a c^{s−1}. Applying the same arguments to the matrix M, the sign of det(M) is the same as that of −c_{s,α} c^s. In addition, using the block matrix inversion formula, and since M22 − M21 M11⁻¹ M12 is a scalar, we get:

\begin{align*}
\det(M) &= \det(M_{11}) \det\left(M_{22} - M_{21} M_{11}^{-1} M_{12}\right) \\
        &= -\det(M_{11}) \left(M_{21} M_{11}^{-1} M_{12} - M_{22}\right) \\
        &= -\det(M_{11})\, d .
\end{align*}

This implies that d has the same sign as c_{s,α} c/a. So, we have the following results:

If a > 0, then c > 0: all the eigenvalues of M11 are positive, and M11 is positive definite.
If c < 0, then a < 0: all the eigenvalues of M11 are negative, and M11 is negative definite.

Clearly, we have d > 0 in these two cases. On the other hand, if (c > 0 and a < 0), we have d < 0; then M11 has at least one positive eigenvalue and at least one negative eigenvalue; thus, M11 is neither positive definite nor negative definite. Theorem 1 is then proved.

C Projections for individual ratios

Here we use two well-known results on matrix inversion from Rao (1973, page 33), which we reproduce for convenience.

Lemma 2 When Π11 and Π22 are symmetric matrices, let

\[
\Pi = \begin{bmatrix} \Pi_{11} & \Pi_{12} \\ \Pi_{12}' & \Pi_{22} \end{bmatrix} .
\]

Then (assuming the inverses in the latter expression exist):

\[
\Pi^{-1} = \begin{bmatrix}
\Pi_{11}^{-1} + \Pi_{11}^{-1}\Pi_{12}\left(\Pi_{22} - \Pi_{12}'\Pi_{11}^{-1}\Pi_{12}\right)^{-1}\Pi_{12}'\Pi_{11}^{-1}
 & -\Pi_{11}^{-1}\Pi_{12}\left(\Pi_{22} - \Pi_{12}'\Pi_{11}^{-1}\Pi_{12}\right)^{-1} \\
-\left(\Pi_{22} - \Pi_{12}'\Pi_{11}^{-1}\Pi_{12}\right)^{-1}\Pi_{12}'\Pi_{11}^{-1}
 & \left(\Pi_{22} - \Pi_{12}'\Pi_{11}^{-1}\Pi_{12}\right)^{-1}
\end{bmatrix} .
\]

Lemma 3 Let Π be a nonsingular matrix and U and V two column vectors. Then

\[
\left(\Pi + UV'\right)^{-1} = \Pi^{-1} - \frac{\left(\Pi^{-1}U\right)\left(V'\Pi^{-1}\right)}{1 + V'\Pi^{-1}U} .
\]

First observe that our CS limits from Theorem 2 correspond to the solutions of a quadratic equation where the sum and the product of the roots are:

\[
\tilde S = 2 w' f, \qquad \tilde P = (w' f)^2 - d \left( w' M_{11}^{-1} w \right) .
\]

Applying Lemma 3 to the matrix M (defined in (4.6)), we have

\begin{align}
M^{-1} &= \frac{H\hat\Sigma_\theta H'}{c}
        + \frac{H\hat\theta\hat\theta' H' / c}{c - \hat\theta' H' \left(H\hat\Sigma_\theta H'\right)^{-1} H\hat\theta} \tag{C.1} \\
       &= \frac{H\hat\Sigma_\theta H'}{c} - \frac{H\hat\theta\hat\theta' H' / c}{c_{s,\alpha}} \tag{C.2}
\end{align}

using (4.6). Now applying Lemma 2 to the matrix M as partitioned in (4.8), we have

\[
M^{-1} = \begin{bmatrix} M_{11}^{-1} - f f'/d & -f/d \\ -f'/d & -1/d \end{bmatrix}
\]

which implies that

\begin{align*}
\left(w', 0\right) M^{-1} \left(w', 0\right)' &= w' M_{11}^{-1} w - \left(w' f\right)^2 / d = -\tilde P / d, \\
\left(0_s', 1\right) M^{-1} \left(w', 0\right)' &= -w' f / d = -\tilde S / (2d), \\
\left(0_s', 1\right) M^{-1} \left(0_s', 1\right)' &= -1/d,
\end{align*}

so

\[
\tilde P = \frac{(w', 0)\, M^{-1} (w', 0)'}{(0_s', 1)\, M^{-1} (0_s', 1)'}, \qquad
\tilde S = \frac{2\,(0_s', 1)\, M^{-1} (w', 0)'}{(0_s', 1)\, M^{-1} (0_s', 1)'} .
\]

Applying the latter expressions to (C.2), and assuming that w is the selection vector with one at the ith position and zeros elsewhere (i.e. the vector which selects the ratio with numerator defined by Li), we obtain

\begin{align*}
\tilde S &= \frac{2\left( c_{s,\alpha} K'\hat\Sigma_\theta L_i - K'\hat\theta\hat\theta' L_i \right)}
                 {c_{s,\alpha} K'\hat\Sigma_\theta K - K'\hat\theta\hat\theta' K} = -\frac{2B_i}{A_i}, \\
\tilde P &= \frac{c_{s,\alpha} L_i'\hat\Sigma_\theta L_i - L_i'\hat\theta\hat\theta' L_i}
                 {c_{s,\alpha} K'\hat\Sigma_\theta K - K'\hat\theta\hat\theta' K} = \frac{C_i}{A_i},
\end{align*}

which gives exactly the sum and product of the roots of the quadratic inequality (4.17). This shows that both solutions are identical.

References

Abdelkhalek, T. and Dufour, J.-M. (1998), ‘Statistical inference for computable general equilibrium models, with application to a model of the Moroccan economy’, Review of Economics and Statistics LXXX, 520-534.
Ashton, H. (1947), ‘The time element in transportation’, American Economic Review 37(2), 423-440.
Banks, J., Blundell, R. and Lewbel, A.
(1997), ‘Quadratic Engel curves and consumer demand’, The Review of Economics and Statistics 79(4), 527-539.
Bates, J. J. (1987), ‘Measuring travel time values with a discrete choice model: A note’, The Economic Journal 97(386), 493-498.
Ben-Akiva, M., Bolduc, D. and Bradley, M. (1993), ‘Estimation of travel choice models with randomly distributed values of time’, Transportation Research Records 1413, 88-97.
Ben-Akiva, M., Bolduc, D. and Walker, J. (2001), Specification, identification, and estimation of the logit kernel (or Continuous Mixed Logit) model, Technical report, MIT working paper.
Ben-Akiva, M. and Lerman, S. R. (1985), Discrete Choice Analysis: Theory and Application to Travel Demand, The MIT Press, Cambridge, MA.
Bolduc, D. (1992), ‘Generalized autoregressive errors in the multinomial probit model’, Transportation Research Part B 26B(2), 155-170.
Bolduc, D. (1999), ‘A practical technique to estimate multinomial probit models in transportation’, Transportation Research Part B 33, 63-79.
Buonaccorsi, J. P. and Gatsonis, C. A. (1988), ‘Bayesian inference for ratios of coefficients in a linear model’, Biometrics 44(1), 87-101.
Buonaccorsi, J. P. (1985), ‘Ratios of linear combinations in the general linear model’, Communications in Statistics, Theory and Methods 14, 635-650.
Darby, S. C. (1980), ‘A Bayesian approach to parallel-line bioassay’, Biometrika 3, 607-612.
Davidson, R. and MacKinnon, J. G. (1999), ‘Bootstrap testing in nonlinear models’, International Economic Review 40, 487-508.
De Vany, A. (1974), ‘The revealed value of time in air travel’, Review of Economics and Statistics 56(1), 77-82.
Deaton, A. S. and Muellbauer, J. (1980), ‘An almost ideal demand system’, American Economic Review 70, 312-326.
Dufour, J.-M. (1989), ‘Nonlinear hypotheses, inequality restrictions, and non-nested hypotheses: Exact simultaneous tests in linear regressions’, Econometrica 57, 335-355.
Dufour, J.-M. (1997), ‘Some impossibility theorems in econometrics, with applications to structural and dynamic models’, Econometrica 65, 1365-1389.
Dufour, J.-M. (2003), ‘Identification, weak instruments and statistical inference in econometrics’, Canadian Journal of Economics 36(4), 767-808.
Dufour, J.-M. and Jasiak, J. (2001), ‘Finite sample limited information inference methods for structural equations and models with generated regressors’, International Economic Review 42, 815-843.
Dufour, J.-M. and Taamouti, M. (2005a), ‘Further results on projection-based inference in IV regressions with weak, collinear or missing instruments’, Journal of Econometrics forthcoming.
Dufour, J.-M. and Taamouti, M. (2005b), ‘Projection-based statistical inference in linear structural models with possibly weak instruments’, Econometrica 73(4), 1351-1365.
Fieller, E. C. (1940), ‘The biological standardization of insulin’, Journal of the Royal Statistical Society (Supplement) 7(1), 1-64.
Fieller, E. C. (1954), ‘Some problems in interval estimation’, Journal of the Royal Statistical Society, Series B 16(2), 175-185.
Graybill, F. A. (1983), Matrices with Applications in Statistics, Wadsworth International Group, Belmont, Calif.
Kleibergen, F. (2005), ‘Testing parameters in GMM without assuming that they are identified’, Econometrica 73(4), 1103-1123.
Lancaster, P. and Tismenetsky, M. (1985), The Theory of Matrices, second edition with applications, Academic Press Inc., Orlando, Florida.
McFadden, D. (1989), ‘A method of simulated moments for estimation of discrete response models without numerical integration’, Econometrica 57, 995-1026.
Miller, Jr., R. G. (1981), Simultaneous Statistical Inference, second edn, Springer-Verlag, New York.
Moreira, M. J. (2003), ‘A conditional likelihood ratio test for structural models’, Econometrica 71(4), 1027-1048.
Rao, C. R. (1973), Linear Statistical Inference and its Applications, second edn, John Wiley & Sons, New York.
Savin, N. E. (1984), Multiple hypothesis testing, in Z. Griliches and M. D. Intriligator, eds, ‘Handbook of Econometrics, Volume 2’, North-Holland, Amsterdam, chapter 14, pp. 827-879.
Savin, N. E. and Würtz, A. H. (1999), ‘Power of tests in binary response models’, Econometrica 67(2), 413-421.
Savin, N. E. and Würtz, A. H. (2001), Empirically relevant power comparisons for limited dependent variable models, in C. Hsiao, K. Morimune and J. L. Powell, eds, ‘Nonlinear statistical modeling: proceedings of the thirteenth International Symposium in Economic Theory and Econometrics: essays in honor of Takeshi Amemiya’, Cambridge University Press, New York, chapter 2, pp. 47-70.
Scheffé, H. (1953), ‘A method for judging all contrasts in the analysis of variance’, Biometrika 40, 87-104.
Scheffé, H. (1959), The Analysis of Variance, first edn, John Wiley & Sons, New York.
Scheffé, H. (1970), ‘Multiple testing versus multiple estimation in proper confidence sets, estimation of directions and ratios’, Annals of Mathematical Statistics 41, 1-29.
Selwyn, M. R. and Hall, N. R. (1984), ‘On Bayesian methods in bioequivalence’, Biometrics 40, 1103-1108.
Stock, J. H., Wright, J. H. and Yogo, M. (2002), ‘A survey of weak instruments and weak identification in generalized method of moments’, Journal of Business and Economic Statistics 20(4), 518-529.
Truong, P. T. and Hensher, D. A. (1985), ‘Measurement of travel time values and opportunity cost from a discrete choice model’, The Economic Journal 95(378), 439-451.
Walker, J. (2001), Extended discrete choice models: Integrated framework, flexible error structures and latent variables, Technical report, Ph.D thesis, Massachusetts Institute of Technology.
Young, D. A., Zerbe, G. O. and Hay, W. W. (1997), ‘Fieller’s theorem, Scheffé simultaneous confidence intervals, and ratios of parameters of linear and nonlinear mixed-effects models’, Biometrics 53(3), 838-847.
Zerbe, G. O. (1978), ‘On Fieller’s theorem and the general linear model’, The American Statistician 32(3), 103-105.
Zerbe, G. O., Laska, E., Meisner, M. and Kushner, H. B. (1982), ‘On multivariate confidence regions and simultaneous confidence limits for ratios’, Communications in Statistics, Theory and Methods 11(21), 2401-2425.