THEORIES AND TECHNIQUES IN HOUSING MARKET ANALYSIS

by

Joseph A. Langsam

B.Sc. Massachusetts Institute of Technology (1968)
M.Sc. University of Michigan, Ann Arbor (1981)
Ph.D. (Mathematics) University of Michigan, Ann Arbor (1982)

Submitted to the Department of Urban Studies and Planning and to the Department of Economics in Partial Fulfillment of the Requirements of the Degree of DOCTORATE IN PHILOSOPHY at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY

MAY 1983

© Joseph A. Langsam 1983

The author hereby grants to M.I.T. permission to reproduce and distribute copies of this thesis document in whole or in part.

Signature of Author: Department of Urban Studies and Planning
Certified by: Thesis Supervisor
Certified by: Second Reader
Certified by: Third Reader
Accepted by: Head, Ph.D. Committee, Department of Urban Studies and Planning
Accepted by: Head, Ph.D. Committee, Department of Economics

THEORIES AND TECHNIQUES IN HOUSING MARKET ANALYSIS
by JOSEPH A. LANGSAM

Submitted to the Department of Urban Studies and Planning and Economics on May 1, 1982 in partial fulfillment of the requirements for the Degree of Doctor of Philosophy

ABSTRACT

Housing market analysis, whether from the vantage of urban planning or economics, presents both methodological and theoretical problems. The housing market is characterized by search, while market data is frequently only available in cross sectional and aggregated formats. This dissertation contains three principal results which should be of use in housing market analysis. In the area of search theory, it is shown that a search model can have an equilibrium price vector where a commodity can have a nondegenerate equilibrium price distribution. A simple one period urn type search model is analyzed and the conditions under which buyers or sellers are made better off by market replication are determined. The buyer's bid problem is analyzed and it is shown that the bid structure need not be monotonic with respect to time.
In the area of estimation and hypothesis testing two results are developed. It is shown that an iterative weighted least squares estimator converges in the sense that for a fixed sample the iterates converge almost surely, and also in the sense that the estimator constructed by taking these limit points converges to the true value of the parameters being estimated and possesses other optimal properties. This analysis corrects an error that appears in the article by Oberhofer and Kmenta. The final result is the analysis of a multistage heteroskedastic estimator which enables consistent estimation and hypothesis testing on the structure of the heteroskedasticity. This is a computationally simple procedure for performing estimation and hypothesis testing on both the underlying model and on the parameters generating the heteroskedastic structure. The procedure presented in this essay, unlike that which appears in the Glejser and Parks papers, leads to consistent estimation and consistent hypothesis tests.

The dissertation begins with a short introduction to the problems in housing to which the theories and methodologies developed in the thesis can be directed. The principal results are presented in the second and third chapters without their proofs. The mathematical proofs are separated out and presented in the fourth chapter so that the thesis can be used by those researchers whose mathematical interests are minimal. The fourth chapter should be of interest to those who are interested in the application of functional analytic tools to regression theory.

Thesis Supervisor: Professor William Wheaton

TO MY PARENTS

TABLE OF CONTENTS

ABSTRACT
DEDICATION
ACKNOWLEDGEMENTS
CHAPTER I - INTRODUCTION
   1. Introduction
   2. Discrimination in housing
   3. Estimation of housing related elasticities
CHAPTER II - MARKET SEARCH
   1.
Introduction
   2. Review of search theory literature
   3. Equilibrium in search models
   4. A housing search model
   5. An optimal bidding problem in a search model
CHAPTER III - ESTIMATION AND HYPOTHESIS TESTING IN THE PRESENCE OF HETEROSKEDASTICITY
   1. Introduction
   2. A theorem of Halbert White
   3. Block scalar covariance matrix
   4. Linear variance model
CHAPTER IV - MATHEMATICAL PROOFS
   1. Introduction
   2. Market search
   3. The housing search model
   4. Block scalar variance covariance matrix
   5. Linear variance model
CHAPTER V - SUMMARY AND CONCLUSION
BIBLIOGRAPHY

ACKNOWLEDGEMENTS

I would like to thank a number of people who over a ten year span have assisted and supported my efforts to complete this dissertation. My parents, Sid and Helen Langsam, gave both moral and financial support to a degree beyond that even expected by a hopeful son. They remained supportive even when, after five years of study in economics and urban studies, I switched fields to earn a Ph.D. in Mathematics. To them, I owe a great debt that cannot be repaid.

I am indebted to my advisors, Professors Wheaton and Hausmann, and to Professor Diamond. They have been most helpful in assisting me to improve the thesis and to navigate the various rules and regulations associated with earning a joint degree. I would also like to acknowledge the kindness and assistance of Professor Eckaus. That this dissertation is too wordy is my fault; I am indebted to Professor Fisher that it is not even more so.

I would like to gratefully acknowledge the many mathematicians at M.I.T.
and the University of Michigan who spent the time to teach me the value of mathematical analysis and rigorous thinking. I would publicly like to thank my colleagues in the Department of Mathematics at Case Western Reserve University who have listened with understanding to my stories about writing a second dissertation.

I want to thank and to acknowledge the contributions of my wife, Betty, and my children, Daniel and Jessica. Without them, this thesis would never have been written and my life could never have been as happy.

CHAPTER I

INTRODUCTION

1. Introduction. The core of the dissertation is comprised of three distinct investigations. The first of these is directed towards problems arising in search theoretic economic models. A reasonable definition of equilibrium is presented and an existence theorem is proven. A simple one period search model is described and the welfare implications associated with increasing the market size through replication are analyzed. The question of optimal bidding in a sequential search model is discussed and an existence theorem for optimal bids is proven. An example is then given which shows that with costly sampling the optimal bid profile need not be monotonic.

The second and third investigations are concerned with estimation and hypothesis testing in a linear model in the presence of heteroskedasticity. In the second investigation an iterative weighted least squares estimator for a linear model with block scalar variance covariance matrix is described and analyzed. A theorem is proven giving sufficient conditions for the successive iterates associated with a fixed sample to form a Cauchy sequence almost surely as the sample size increases. This theorem corrects an error in the paper by Oberhofer and Kmenta [33], where the argument showing convergence of the successive iterates is faulty. The estimator constructed by taking the limits of the successive iterates is shown to have desirable asymptotic properties.
The final investigation is directed towards estimation and hypothesis testing when the variances follow a linear model. A simple procedure is described and analyzed for estimating and performing hypothesis tests on the variance model and then for estimating and performing hypothesis tests on the original model. Unlike the procedure given in Glejser [//], the asymptotic distribution of the estimator for the variance model can be computed. The proofs of the theorems showing consistency of this estimator and of the hypothesis tests are an extension of the work given in White [51]. They are not an immediate consequence of White's procedure, in that his procedure would require inputs that are not observable. This proposed multiple step procedure is then shown to generate an estimator for the underlying linear model with optimal asymptotic properties and with easily computed asymptotic distributions. It is a simple consequence of the theorems proven that this procedure generates a simple test for heteroskedasticity.

The motivation for these essays comes from problems encountered in urban planning, most particularly in housing and manpower planning. The labor and housing markets are ones in which search plays an important role. They are also markets in which metropolitan and regional data are collected in a cross sectional rather than a time series format. Thus heteroskedasticity is likely to be a major problem in the analysis of the regional and metropolitan data.

This dissertation is intended to be a contribution in both urban studies and economics. For this reason, the dissertation contains a brief discussion of urban problems towards which the theoretical ideas developed in the main body can be applied. A brief discussion of housing discrimination is given, since programs directed at housing desegregation tend to aggregate the market, and market aggregation or replication is a subject of the first investigation.
The attempts to estimate housing demand elasticities are briefly reviewed, since this estimation will often be done in models with heteroskedasticity present. Because of the varied purposes towards which the dissertation is directed, the thesis is organized to present the major results first qualitatively, then quantitatively with proofs only sketched, and finally with full mathematical rigour with complete proofs. While this leads to unfortunate redundancy, it does provide the policy maker, technician, and theoretician with the level of generality appropriate for their needs.

The remainder of this chapter contains the above mentioned discourse on housing discrimination and on estimation of housing demand elasticities. The second chapter contains the search related results. It begins with a brief review of known results, followed by a description of equilibrium in search models and a theorem giving conditions for the equilibrium to exist. A simple consequence of this theorem is that in a search model identical commodities need not have the same equilibrium price. A simple housing search model is presented and the welfare implications of market replication are analyzed. The chapter ends with a discussion of optimal bidding in a search model, containing an existence proof and an example showing that the optimal bid profile need not be monotone with time.

The third chapter begins with a general discussion of the nature of the heteroskedasticity problem. The iterative least squares estimator for the block scalar variance covariance matrix is then analyzed. Theorems giving conditions for the convergence of successive iterates and for the optimal asymptotic properties of the estimator are stated. White's procedure is briefly described, followed by the description of a new procedure for estimation and hypothesis tests for models where the variance structure follows a linear model.
From this procedure an estimator and hypothesis tests for the variance model are developed. A theorem giving the properties of these tests and estimator is stated. As a corollary a simple new test for heteroskedasticity is presented. Also from this procedure a multiple stage estimator and hypothesis tests for the basic model are developed. A theorem giving conditions for these tests and estimator to have desirable optimal asymptotic properties is then stated.

The fourth chapter contains a restatement of the theorems of the preceding two chapters together with complete proofs. It should be of interest to those interested in the application of functional analytic technique to statistical problems. The last chapter contains concluding remarks and suggestions for future applied and theoretical research.

2. Discrimination in Housing. The nation has adopted as policy goals the desegregation of residential neighborhoods and the increase of housing consumption by low income families. Federal, state, and local governments have instituted a variety of programs seeking these goals, including programs which provide rent subsidies, subsidized new construction, legislative relief through zoning, and market information services. These programs are designed to relieve some of the barriers that the planner perceives to generate segregated housing patterns. Underlying the choice of programmatic relief must be a theory of market behavior. Since resources are limited, one naturally tries to select those programs which are most cost effective. To do this requires a sufficient knowledge of economic theory and of empirical techniques. In [/o3, p.49D], Stevens outlines the problem when she states:

Studies of housing demand in the United States have found significant differences between the behavior of white majority and black minority households. In particular, blacks choose housing of a different tenure class mix, quality, and location from white households.
These differences in demands have been said to depend on any or all of the following: (1) blacks' preferences differ from those of whites; (2) blacks consume different quantities and qualities of housing than do whites because blacks face price and entry discrimination in the housing market; and (3) blacks' income, both current and permanent, is lower than whites', causing blacks to consume less housing services than do whites.

Identification of the most important cause of the difference between black and white housing consumption patterns is important in devising a housing policy to meet national housing goals. If discrimination is the most important cause of low housing consumption by blacks, then an open housing policy is indicated. If differences in consumption patterns are mainly attributable to differences in current or permanent income, then transfers or policies to increase job skills, labor mobility and employment quality will achieve housing goals. Finally, if tastes differ, housing vouchers or other housing subsidies may be the only feasible way to induce some households to consume an "adequate" level of housing services, however such level of services is defined.

Most previous work has found that price and entry discrimination exists in the housing market. Blacks on the average earn smaller incomes than do whites, whether income is measured on a current or a permanent basis. There is, however, disagreement on the portion of demand differences which can be attributed to each of these factors.
The possible barriers which preclude integrated neighborhoods as a result of market forces include: differences in tastes between blacks and whites resulting in differing preferences for housing consumption, differing tastes resulting in whites strongly preferring self association, discriminatory practices on the part of sellers and brokers, historically generated endowment differences between blacks and whites, transportation network limitations that result in costly commuting between certain residences and certain job sites, and historically generated housing and work place locational differences between blacks and whites that preclude the free flow of market information. Clearly any subset of these may cause segregated housing patterns. The planner is confronted with the problem of selecting those which make the greatest contribution to market segregation. To choose among these factors those that are dominant requires both a market structure theory and a means of empirically estimating market parameters. In the area of residential segregation, much effort has been made in determining the importance of these barriers and one can find some of the results in [(), [/lo)o [73 ], an[7 d [/04 ),and [79], [/aS ]. [Wl], [9f ] [ ] [p], [c, [ 1 , [41, [?t]

A central issue in planning to desegregate residential neighborhoods is whether segregation results from pure preference considerations or from discriminatory practices in the market place. Pure preference considerations can generate segregated housing if either there are housing consumption preference and endowment differences between blacks and whites or if racial considerations enter directly into preference structures.
Kern in [?/], using an equilibrium analysis in a housing market model, is able to show that if whites' preference for whites is stronger than blacks' preference for blacks, an integrated equilibrium is unstable and a segregated equilibrium is stable; if blacks' preference for whites is stronger than whites' preference for whites, no segregated equilibrium exists and a stable integrated equilibrium exists. In the integrated equilibrium all sites have identical racial composition and therefore racial composition has no effect on the equilibrium rent distance function. In the segregated equilibrium where whites prefer whites and blacks prefer blacks, equilibrium rents on the white side of the border may exceed, equal or fall short of those on the black side. In the segregated equilibrium where both races prefer white neighborhoods, rents on the white side of the border exceed those on the black side, provided there are no discriminatory practices. Farley and Bianchi in [73] report a survey that suggests that whites prefer whites while blacks prefer a 50-50 integrated neighborhood. Thus in a segregated equilibrium one expects, in the absence of discriminatory practices, that segregated white residential rents will exceed black rents.

Mieszkowski and Syron in [7] summarize current economic housing market theory, and their findings show that income differences are a small factor leading to segregation, that pure preference differences are a significant factor and lead to whites paying a premium for segregation, and that whites overtly discriminate against blacks, resulting in both higher housing prices and limited job opportunities for blacks. Follain and Malpezzi in (I] attempt to empirically test whether blacks pay a premium. They estimate a hedonic model in which race is a variable and, using micro level survey data, show that blacks receive a discount of about 15% in owner occupied units and 6% in renter markets.
This supports a finding that pure preferences are a dominant factor in determining market segregation.

The question of whether segregation is a result of pure preferences or discrimination is important in a number of areas. If segregation results from preferences of self association and if residential housing market segregation does not cause disadvantage in other markets, programs to force integration may result in everyone being worse off. There is evidence, see for example [7,?], that housing segregation leads to blacks being at a disadvantage in the labor market. In this case, one must understand that improvements in blacks' welfare in the labor market resulting from integration of residence are traded off against losses from not being able to self-associate. If, however, segregation is the result of discriminatory practices, programmatic relief might include both legislative and compensatory programs. Both policy goals and programmatic content are affected by the identification of those barriers which generate residential segregation.

Whatever the cause of neighborhood segregation, programs which successfully reduce segregation have the effect of aggregating several nearly independent submarkets into a larger housing market. In many instances a close examination of the housing market reveals that it is comprised of nearly independent submarkets. Some of the factors which lead to this market segmentation include: strong ethnic self-association preferences, transportation networks which make interzonal commuting costly, physical barriers such as rivers, parklands and large expressways, and historically generated governmental structures. Frequently programs which are directed to these factors have as a secondary effect changes in market segmentation.
In markets in which search is characteristic there are frictional costs associated with search; that is, costs required to obtain price information and costs associated with making decisions without full knowledge of price structures. The aggregation of submarkets into a larger market will change these frictional costs. In the first essay, we perform a partial equilibrium analysis for a specialized type of market aggregation to determine the effect upon frictional costs associated with market aggregation. An equivalent formulation of the problem of aggregating several identical markets is to consider the replication of a single market. This latter approach is easier to deal with analytically and is the approach taken in the first essay.

In the market there are $m$ buyers and $n$ sellers. When the market is replicated $x$ times there are, of course, $xm$ buyers and $xn$ sellers. Each seller has one unit for sale. The $i$th buyer, $I_i$, has a potential bid of $Y_{i,j}$ for the $j$th unit. In the replicated market the $(i,r)$th buyer has a potential bid of $Y_{i,j}$ for the $(j,t)$th unit, where $1 \le t \le x$ and $1 \le r \le x$. The $j$th seller, $J_j$, has a reservation price or minimum acceptance price of $X_j$; in the replicated market the reservation price held by the $(j,t)$th seller is $X_j$. All analysis is for a single fixed time period. In this time period each buyer, independent of other buyers, selects from a uniform probability distribution exactly one seller to visit. Buyer $I_i$ will purchase the unit owned by seller $J_j$ if buyer $I_i$ visits seller $J_j$ with bid $Y_{i,j} \ge X_j$ and either:

1) $Y_{i,j} > Y_{i',j}$ for all other buyers $I_{i'}$ that visit $J_j$, or
2) $Y_{i,j} \ge Y_{i',j}$ for all other buyers $I_{i'}$ that visit $J_j$, and buyer $I_i$ wins the toss of a fair $s$ sided die, where $s$ is the number of buyers that visit $J_j$ with bid $Y_{i',j} = Y_{i,j}$.

The same transaction rules apply to the replicated market.
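The transaction rule just described can be explored numerically. The following is a minimal Monte Carlo sketch, not the analytical method of the first essay: it assumes the rule as stated above (uniform, independent seller choice; highest visiting bid at or above the reservation price wins; ties broken by a fair draw). The function name `simulate_purchase_prob`, the data layout `bids[i][j]`, and the example numbers are hypothetical and chosen only for illustration.

```python
import random


def simulate_purchase_prob(bids, reservations, x, buyer, trials=20000, seed=0):
    """Estimate the probability that one copy of base buyer `buyer` makes a
    purchase when the market is replicated x times.

    bids[i][j]      -- bid of base buyer i for the unit of base seller j
    reservations[j] -- reservation price of base seller j
    (Hypothetical layout, assumed for this sketch.)
    """
    rng = random.Random(seed)
    m, n = len(bids), len(reservations)
    n_sellers = n * x  # the replicated market has x copies of every seller
    wins = 0
    for _ in range(trials):
        # The tracked buyer visits one seller chosen uniformly at random.
        target = rng.randrange(n_sellers)
        j = target % n  # base type of the visited seller
        my_bid = bids[buyer][j]
        if my_bid < reservations[j]:
            continue  # bid below the reservation price: no sale
        tied = 1  # buyers at this seller bidding exactly my_bid (incl. self)
        outbid = False
        for i in range(m):
            for r in range(x):
                if i == buyer and r == 0:
                    continue  # skip the tracked buyer
                # Every other buyer copy independently picks a seller.
                if rng.randrange(n_sellers) != target:
                    continue
                if bids[i][j] > my_bid:
                    outbid = True
                elif bids[i][j] == my_bid:
                    tied += 1
        # Ties are broken by a fair draw among the tied visiting buyers.
        if not outbid and rng.randrange(tied) == 0:
            wins += 1
    return wins / trials


if __name__ == "__main__":
    # Hypothetical market: 3 buyers, 2 sellers.
    bids = [[10.0, 8.0], [9.0, 9.0], [7.0, 12.0]]
    reservations = [6.0, 8.5]
    for x in (1, 2, 4):
        print(x, simulate_purchase_prob(bids, reservations, x, buyer=0))
```

Running the sketch for several values of `x` gives a rough picture of how a particular buyer's purchase probability moves as the market is replicated; the estimates carry sampling noise that shrinks as `trials` grows.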
Since we are interested in the effect upon frictional costs associated with market segmentation and aggregation, bid and reservation prices are held fixed. In this model, the frictional cost in the replicated market to the buyer $I_i$ can be measured by $P(x,i)$, the probability of making a purchase. The frictional cost in the non-replicated market is given by $P(1,i)$. The frictional costs to the seller $J_j$ in the replicated market have two measures: $Q(x,j)$, the probability of making a sale, and $E(x,j)$, the expected value of a sale given that one is made.

The first step in analyzing the impact on frictional costs associated with aggregation is to compute $P(x,i)$, $Q(x,j)$, and $E(x,j)$. Before giving their values, it is necessary to introduce some additional notation. For an arbitrary set $S$, let $\|S\|$ denote its cardinality. Let $G_j(r)$ be the fraction of buyers whose bid for the $j$th unit does not exceed $r$, that is, $G_j(r) = \|\{I_i : Y_{i,j} \le r\}\|/m$. Let $B_j(r)$ be the fraction of buyers whose bid for the $j$th unit equals $r$, and let $H_j(r)$ be the fraction of buyers whose bid for the $j$th unit is at least $r$. Let $F(r)$ be the fraction of sellers whose reservation price for their unit does not exceed $r$, that is, $F(r) = \|\{J_j : X_j \le r\}\|/n$. Let $A_i$ be the set of sellers whose reservation price does not exceed the bid of $I_i$ for their unit, and let $C_j$ be the set of buyers whose bid for the unit owned by $J_j$ is at least as great as its reservation price. Finally, let $a = m/n$ be the ratio of buyers to sellers and let $\mu(x) = 1 - \frac{1}{nx}$. It is important to note that those measures that are ratios do not change when the market is replicated. Using the above notation it is shown that

$$P(x,i) = \sum_{j \in A_i} \frac{\mu(x)^{xm[H_j(Y_{i,j}) - B_j(Y_{i,j})]} - \mu(x)^{xm H_j(Y_{i,j})}}{m B_j(Y_{i,j})},$$

$$Q(x,j) = 1 - \mu(x)^{xm H_j(X_j)},$$

$$E(x,j) = \sum_{i \in C_j} Y_{i,j}\, \frac{\mu(x)^{xm[H_j(Y_{i,j}) - B_j(Y_{i,j})]} - \mu(x)^{xm H_j(Y_{i,j})}}{m B_j(Y_{i,j})\,\bigl[1 - \mu(x)^{xm H_j(X_j)}\bigr]}.$$

While the number of times $x$ that a market may be replicated is integer valued ($x > 0$), the expressions $P(x,i)$, $Q(x,j)$ and $E(x,j)$ have for each $i$ and $j$ a natural extension to a smooth function in the $x$ variable. To determine the effects on frictional costs associated with aggregation one can examine the values of $\partial P(x,i)/\partial x$, $\partial Q(x,j)/\partial x$, and $\partial E(x,j)/\partial x$ for $x > 1$. We see that for a fixed $j$, if there are at least two buyers with different bids for the $j$th unit at least as large as $X_j$ and if $nx > 2$, then both $\partial Q(x,j)/\partial x < 0$ and $\partial E(x,j)/\partial x < 0$. Thus in almost every case the sellers' frictional costs are increased by aggregating equivalent submarkets. For the buyer $I_i$ the results are not as conclusive; however, the following can be shown:

1) sufficient conditions for $\partial P(x,i)/\partial x > 0$ are that $nx > 2$, $A_i \ne \emptyset$, and $a > \sup_j [H_j(Y_{i,j}) - B_j(Y_{i,j})]^{-1}$;
2) sufficient conditions for $\partial P(x,i)/\partial x < 0$ are that $A_i \ne \emptyset$, $nx > 2$, and $a < -1 \big/ \bigl(nx \ln\frac{nx-1}{nx}\bigr)$;
3) if $A_i \ne \emptyset$, $nx > 2$ and $a < 5/7$, then $\partial P(x,i)/\partial x < 0$.

As a consequence of 2 above, in markets with a large number of sellers and in which sellers outnumber buyers it will in almost every case be to the benefit of the buyer not to have market aggregation. In general it appears that maintaining segmented markets helps both the buyers and sellers by restricting the competition that buyers feel from other buyers and sellers from other sellers. The exception occurs for the buyer who tends to be a low bidder for each unit he sees and who is in a market with many more buyers than sellers; in this case the low bid buyer would prefer aggregation. In this case, it appears that aggregation gives the low bidder more opportunities for finding a unit for which he is the only bidder.

The above analysis has clear limitations. In addition to assuming that the segmented markets are identical, it is a single period model without dynamic features. Nevertheless, it provides a framework for illustrating the importance of frictional costs that arise wherever search is a consideration.
In particular it alerts the planner and program analyst to welfare factors that should be considered whenever programs directly or indirectly affect the parameters of search, including market segmentation.

3. Estimation of Housing Related Elasticities. An essential part of the planning process, especially in housing and manpower planning, is the identification of the current status of various market parameters. In the field of housing analysis and planning the important parameters include price and income elasticities of demand and price elasticity of supply. Recall that if $f$ is a function from $R^n$ into $R$, $Y = f(X)$, and $X^* = (X_1^*, X_2^*, \ldots, X_n^*)$ is a point in $R^n$, then the elasticity of $Y$ with respect to $X_j$ at the point $X^*$ is denoted by $\eta_j(X^*)$ and is given by

$$\eta_j(X^*) = \frac{\partial f}{\partial X_j}(X^*)\,\frac{X_j^*}{f(X^*)}.$$

The elasticity is a measure of the percentage change in output for a percentage change in an input.

Much of the current debate among housing policy analysts centers on whether programs for low income housing should feature supply or demand side subsidies. Central to this debate are questions concerning supply and demand elasticities. If the income elasticity of demand is high and the price elasticity of supply is near zero, then income transfer programs that raise the incomes of low income families will tend to just raise housing prices. If, on the other hand, the price elasticity of supply is very large, such programs will then result in low income families consuming more housing without large increases in the price of housing. If the price elasticity of demand is near zero, supply subsidies designed to lower the price of housing will not result in substantial increases in housing consumption. Knowledge of these elasticities is also important in designing programs addressed to residential segregation. Suppose higher quality, higher priced housing is found in predominately white neighborhoods and lower quality, lower priced housing is found in racially mixed neighborhoods.
If low income whites have a higher income elasticity of demand than low income non-whites, an income transfer program that raises the income of low income families will result in white families leaving the integrated neighborhoods to move to the substantially white neighborhoods. If the non-white income elasticity of demand is the larger, then the same program will result in non-whites moving into segregated neighborhoods and thus increase the amount of residential integration.

The usual procedure for estimating part or all of an unknown parameter vector $\beta$ in $R^K$ is to specify (assume) a linear relation $Y = X\beta + \epsilon$ where the vector $Y$ and data matrix $X$ are observable. The vector $\beta$ is estimated by $b_{ols} = (X^T X)^{-1} X^T Y$ and the variance of $b_{ols}$ is estimated by $s^2 (X^T X)^{-1}$, where

$$s^2 = \frac{(Y - X b_{ols})^T (Y - X b_{ols})}{n - K}$$

and $Y$ is an $n \times 1$ vector ($n$ is the number of observations). Among the assumptions which are implicitly made in using this procedure are that the error term has zero mean and follows a homoskedastic distribution, that is, $E(\epsilon \epsilon^T) = \sigma^2 I$ where $\sigma^2$ is a positive scalar. Under the usual assumptions, it is not difficult to show that the estimators $b_{ols}$ and $s^2$ of $\beta$ and $\sigma^2$ have certain desirable properties including:

1) $b_{ols}$ and $s^2$ are consistent,
2) in the event that the data matrix $X$ is non-stochastic, $b_{ols}$ is a best linear unbiased estimator of $\beta$ ($b_{ols}$ is BLUE),
3) $b_{ols}$ is asymptotically efficient.

If, however, the error term has zero mean but follows a heteroskedastic distribution, $E(\epsilon \epsilon^T)$ a non-scalar diagonal matrix, then while $b_{ols}$ is still consistent, efficiency is lost and $s^2$ is not a consistent estimator of $\sigma^2$, so, in particular, the usual hypothesis tests based on $s^2$ should not be performed.

Heteroskedasticity may be introduced into a model because of both theoretical considerations and as a result of measurement procedures. The most obvious way in which heteroskedasticity is introduced is through data grouping. If the model is $Y_{ij} = X_i \beta + \epsilon_{ij}$ ($1 \le i \le n$, $1 \le j \le m_i$) where the $\epsilon_{ij}$ are homoskedastic, and the estimated model uses group averages, that is, the estimated model is $\bar{Y}_i = X_i \beta + \bar{\epsilon}_i$ where

$$\bar{Y}_i = \frac{1}{m_i} \sum_{j=1}^{m_i} Y_{ij}, \qquad \bar{\epsilon}_i = \frac{1}{m_i} \sum_{j=1}^{m_i} \epsilon_{ij},$$

then the $\bar{\epsilon}_i$ follow a heteroskedastic distribution. If the numbers $m_i$ are known, one can perform weighted least squares by weighting the vector $(\bar{Y}_i, X_i)$ by $\sqrt{m_i}$ and thus regain some of the optimal properties of OLS. If the $m_i$ are not known, one should perform a weighted least squares procedure which first estimates the $m_i$ and then uses these estimates to weight the observations.

Heteroskedasticity can also be present for purely theoretical considerations. The best example is the problems associated with estimating consumption or demand functions. In the simplest case, see [/S,/B], consumption is assumed to be a linear function of income and the model $C_t = A_0 + B Y_t + \epsilon_t$ is estimated, where $C_t$ is consumption and $Y_t$ is income. It is observed that the variance of $\epsilon_t$ varies with $Y_t$, and it is easy to develop a theoretical explanation for this phenomenon. For an understanding of the consumption income relationship as well as for prediction purposes it is important to analyze and estimate this variance-income relationship (see [U2] for an introduction to the relevance of the heteroskedastic structure to the original model). Heteroskedasticity is likely to be a problem in the estimation of housing demand elasticities both because of grouping and because of the theoretical structure of the error component.

There are in the literature a number of attempts to estimate price and income elasticities of demand and own price elasticities of supply with wide ranging results. (See, for example, [4], ['/], [70], [3fj], [72], [/0/], [[o], [R], [/o,], [/7], [g/], [7], and [F7].) The following table gives an overview of the wide range of estimates that have been presented.
It is difficult to compare the estimates of different authors analytically since they use different specifications and different measures for the various variables. To the extent, however, that the concepts of income elasticity and price elasticity have developed into policy and program planning variables, it is interesting to see the variance among these estimates.

Dusenberg and Kristin, 1953, [70] Lee, 1968, -. 078 .8 owners .58 renters [f'q] Winger, 1968, n p n y .15 (6) [/o] 1.03 Maisel, Burnham and Austin, 1971, [R] de Leeuw, 1971, Carliner, 1973, [g3-1 .7 -1.5 owners .8 -1.0 renters [] .50 renters King, 1976, [8',]2 Polinsky, 1977, -. 89 .45 .64 [4/] .75 Stegman and Sumka 1978 , [/0O] Polinsky and Elwood, 1979 L3C) McRae and Tuner, 1981 [oi-) .251 .337 .195 current income -.400 (permanent income) (Black Families) .57 - .67 micro -.72 grouped .25 - .39 .89

1) Their reported elasticity is the coefficient of a linear demand equation. The imputed elasticity would be 0.6.

2) King estimates a Lancastrian demand model and the value 0.6 reported in the table is the imputed elasticity of demand for space with respect to income.

A central question in the estimation of the income elasticity of demand is how to measure income. In general, micro level household income data are not collected. Furthermore it is not clear which concept of income, permanent or current, should be used. Polinsky and Ellwood [35] attempt an analysis of the specification error associated with various choices of measures of income and with using micro versus grouped data. This paper critiques the earlier work of Carliner [8], Lee [9/], Maisel, Burnham, and Austin [9], and Winger [/17-] and shows that much of the divergence of income and price elasticities can be explained by misspecification of the income variable.
While Polinsky and Ellwood observe that heteroskedasticity will occur due to grouping, they fail to adjust for its presence in the micro model estimation and assume in the grouped estimation model that it arises only from grouping.

In the last two technical essays two procedures for estimation and hypothesis testing in the presence of heteroskedasticity are presented and analyzed. In the first of these the variance covariance matrix is assumed to be block scalar with a fixed number of blocks. The model to be estimated could be of the form

$$Y_{ij} = X_{ij}\beta + \varepsilon_{ij}, \qquad 1 \le i \le m,\ 1 \le j \le n_i,$$

where the variance of $\varepsilon_{ij}$ is $\sigma_i^2$ ($\sigma_i^2 > 0$). The estimation scheme presented is an iterative weighted least squares procedure. In the first stage $\beta_1$ is computed using weighted least squares (weights = 1). In the $(r+1)$st stage $\sigma_i^2$ is estimated by

$$\frac{1}{n_i}\sum_{j=1}^{n_i} (Y_{ij} - X_{ij}\beta_r)^2$$

and $\beta_{r+1}$ is computed using the weights derived from the estimated $\sigma_i^2$. In the ensuing analysis it is shown that the sequence $\beta_r$ converges almost surely as $\max(n_i) \to \infty$; it is also shown that the estimates for the variances converge. The resulting estimators of $\beta$ and $\sigma_i^2$ are shown to have the usual desirable asymptotic properties, the estimator for $\beta$ being asymptotically equivalent to weighted least squares with known true weights. It is also shown that the usual hypothesis tests computed in the weighted model, using the weights derived from the estimated $\sigma_i^2$, have the usual known asymptotic distributions. This procedure of iterative weighted least squares is appropriate when using grouped data or when the variance of the error term is thought to be related to the values of a discrete variable.

In the last essay we present a procedure for estimation and hypothesis testing in the model $Y = X\beta + \varepsilon$ where the variance covariance matrix is diagonal with diagonal vector $\sigma^2$, where $\sigma^2 = Z\Gamma$, and where $Z$ is observed and $\Gamma$ is unknown. Starting with the works of Glejser and White we develop an easy to apply multiple step estimation procedure that not only permits hypothesis testing on the estimated $\Gamma$ but also yields an estimator for $\beta$ which is asymptotically equivalent to weighted least squares with known weights and for which the usual standard hypothesis tests are valid. In the first stage $\beta$ is estimated by $b_{ols}$, the ordinary least squares estimator, and $\Gamma$ is estimated by

$$\gamma = (Z^T Z)^{-1} Z^T e^2,$$

where $e^2$ is the vector whose $j$th element is $e_j^2 = (Y_j - X_j b_{ols})^2$. Asymptotically valid hypothesis tests can be performed on $\Gamma$ using $\gamma$ together with an estimator of its variance covariance matrix built from $\frac{1}{n}\sum_{i=1}^{n} W_i^2 Z_i^T Z_i$, where $W_i^2 = (e_i^2 - Z_i\gamma)^2$. The estimated variances are computed by $\hat{\sigma}^2 = Z\gamma$ and weighted least squares is used to reestimate $\beta$. The statistics reported from the ordinary least squares regression of the weighted model have the expected known asymptotic distributions.

Heteroskedasticity has been an often ignored problem in estimation and hypothesis testing associated with urban economic analysis. In particular it will be present in the estimation of demand elasticities, which are themselves important variables in the formulation of urban housing policies. This dissertation presents two procedures for handling the heteroskedasticity problem. The first of these is appropriate when the variance covariance matrix of the error term can be put in a block scalar form, for example when the variance is related to the values of a variable with finite discrete range. The second procedure is appropriate whenever the variance of the error term is a linear function of observable variables. This latter procedure is particularly appropriate for use in estimating demand elasticities where the variance of the error term is likely to be a function of income.

There are a number of research directions suggested by the works in the dissertation.
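As a rough illustration of the multiple step procedure summarized in this introduction, the following is a minimal sketch and not the estimator as developed in the essay itself. It assumes a single regressor plus an intercept, takes the variance regressors to be a constant and one observed variable `z`, and floors the fitted variances at a small positive number so that the weights stay finite. The function names and the floor are illustrative assumptions, not part of the thesis.

```python
def weighted_line_fit(x, y, w):
    """Weighted least squares for y = a + b*x; returns (a, b)."""
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * swxx - swx * swx          # determinant of the 2x2 normal equations
    b = (sw * swxy - swx * swy) / det
    a = (swy - b * swx) / sw
    return a, b


def multistage_fit(x, y, z, floor=1e-8):
    """Three-stage sketch: OLS, squared residuals regressed on (1, z), then WLS."""
    ones = [1.0] * len(x)
    a0, b0 = weighted_line_fit(x, y, ones)                    # stage 1: OLS
    e2 = [(yi - a0 - b0 * xi) ** 2 for xi, yi in zip(x, y)]   # squared residuals
    g0, g1 = weighted_line_fit(z, e2, ones)                   # stage 2: variance regression
    # Fitted variances; the floor is an assumption added here for numerical safety.
    var = [max(g0 + g1 * zi, floor) for zi in z]
    a1, b1 = weighted_line_fit(x, y, [1.0 / v for v in var])  # stage 3: WLS
    return (a1, b1), (g0, g1)
```

With equal weights `weighted_line_fit` reduces to ordinary least squares, which is how the first stage is computed here. Weighting each squared deviation by the reciprocal of its fitted variance is the scalar counterpart of dividing each observation by its estimated standard deviation.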
In the area of search theory, one should begin to examine the impact of aggregating non-identical markets and should begin to develop multiple period search models in which there are intertemporal dynamics. In particular one wants to investigate how search affects long run housing patterns. In the area of estimation and hypothesis testing in the presence of heteroskedasticity, both the estimators presented here should be compared to other procedures to determine their relative powers. In particular the iterative scheme presented here can be compared to the simple one iteration weighted least squares procedure in a Monte Carlo study. Clearly, the next obvious step in the estimation of housing demand elasticity is to redo the study of Polinsky and Ellwood using the heteroskedasticity correcting procedure developed in the last essay. It will be interesting to learn whether the divergence among elasticity estimates can in part be attributed to the failure to correct for heteroskedasticity.

CHAPTER II

MARKET SEARCH

1. Introduction.

In markets with search, either consumers or producers are making decisions with less than full information about commodities and prices. Buyers or sellers are making decisions based upon expected prices and not upon a commonly observed market price. Thus, a priori, one need not expect that a commodity in a search model would have a well defined price. Therefore consumers and producers are faced with not only the decision of how much to buy and sell, but also if, when, and where to make these transactions.

Both housing and labor markets are classical examples of search markets. The prospective house buyer usually is unaware of the available stock, its quality and its price, while the prospective seller is unaware of the potential market demand. In labor markets, the employer is uncertain of the prospective skills of an applicant and of the minimal wage acceptable to the applicant. The applicant is unaware of both job openings and their potential wage rates.
While much of the work done in search theory has drawn from labor market observations for its motivation, many of the results are immediately transferable to the analysis of housing markets.

The remainder of the chapter is divided into four sections. The next section is a brief review of the search theory literature, with most of the articles concentrating on search in labor markets. The third section contains a discussion of equilibria in search models together with a tentative definition of equilibrium and an existence proof. The fourth section contains a simple one period housing search model which is used to analyze the effects upon buyers and sellers resulting from duplicating the market. The last section contains a discussion of optimal bidding in a search framework.

2. Review of Search Theory Literature.

Search is a feature in every market. In almost every market where the cost of search is sufficiently high, transactions are made with the participants possessing less than full information. In some markets, the marginal cost of search is so small that behavior in this market is perturbed but slightly from that in a deterministic market with perfect information. In deterministic full information models, however, it is difficult to support nondegenerate price distributions for homogeneous goods, support un- and underemployment of resources, and support the existence of advertising, while in a search model these phenomena are natural consequences. Despite the obvious importance of search in economic analysis it has only recently been developed in the literature. Economic search literature seems to owe its origin to the two papers by Stigler [Nir]. In these papers, Stigler argues that a non-degenerate price distribution might be supportable if the cost of obtaining price information is high. In the later paper, he presents a job search model that will support the job seeker's accepting a wage less than the maximum available in the economy.
It has been shown, however, that the search strategy presented in this model is suboptimal.

Much of the literature subsequent to Stigler's premier articles can be divided into two classes: optimal search strategy, and existence of non-degenerate equilibrium price and wage distributions. The optimal search or optimal stopping time theory that appears in economics also appears in sequential analysis in statistics, and in the study of stopping times and martingales in probability. It should probably be a meta theorem that any theory that appears semi independently in three fields has relevance. In any event, optimal search theories are making a contribution in explaining labor market behavior.

Much of the optimal search literature has regarded the job seeker's and employer's problems as distinct and separate.* The job seeker's problem is taken to be some variation of the following. The job seeker samples sequentially from the distribution of wage offers, incurring a cost for each sample, and seeks a rule to tell him when to stop searching and start working. The employer's problem is taken to be a variation of the following. The employer sequentially observes job seekers with a particular marginal product from a distribution, incurring a cost for sampling. The employer is seeking a rule telling him when to stop sampling and make a wage offer w. It is surprising that these problems are not consistent. It is assumed that if the employer makes an offer, it will be accepted, while the job seeker's problem is when to accept an offer. Furthermore in most of the literature, the employer is assumed to have a fixed wage w, so that his decision is only to offer or not offer a job. The complexity of the analysis depends upon the assumptions that are made on the objectives being optimized and the learning process.

* For the purpose of simplicity, the language of labor market analysis is adopted throughout this section.
Some of the variations that have been analyzed in the optimal search strategy literature include assumptions of infinite time horizons with no discounting, finite time horizons with positive time discount, a random number of job offers at each period, an underlying wage distribution that is known, an underlying wage distribution learned through a Bayesian process, risk aversion, and wealth constraints with bankruptcy. The literature has also addressed the question of search strategies when one can choose between distributions and search strategy when one can, at a cost, affect the distributions one faces. The former model is used by Wohlstetter and Coleman [10], Kosters and Welch [20], and McCall [u] to explain observed discriminatory behavior in the work place. The latter model has been used by many to explain advertising.

Kohn and Shavell [/71 made a substantial contribution by reformulating the optimal search problems in sufficient abstraction and showing that the same analysis applies to both the employer's and job seeker's problem. They are then able to show that in the majority of the variations, the optimal stopping rule is a switch point rule. That is, if under the optimal stopping rule one has not stopped after n samples, then there is a number s such that one stops at time n+1 if the utility associated with the (n+1)th observation exceeds s and continues if it is less than s.* They are then able to determine what happens to the switchpoint s under a variety of conditions. They show that s falls with an increase in time preference, and with an increase in next period's expected search costs. They are also able for special cases to determine the effect upon s of increased risk in the sense of Rothschild-Stiglitz [37] and in the sense of Diamond-Stiglitz [//].

* While their analysis holds for other economic problems involving optimal search, Kohn and Shavell have chosen to use the language of the expected utility maximizer.
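To make the switch point idea concrete, the following is a minimal sketch of the simplest stationary case only, and not Kohn and Shavell's general formulation. It assumes a risk neutral searcher, a known discrete offer distribution, a constant cost per draw, no discounting, and no limit on the number of draws. Under those assumptions the switch point s solves E[max(w - s, 0)] = c, the point at which one more draw just pays for itself. The function names are illustrative.

```python
def expected_gain(s, offers, probs):
    """E[max(w - s, 0)]: expected improvement over s from one more draw."""
    return sum(p * max(w - s, 0.0) for w, p in zip(offers, probs))


def switch_point(offers, probs, cost, tol=1e-10):
    """Bisect for the s at which expected_gain(s) equals cost (cost > 0)."""
    lo, hi = min(offers) - cost, max(offers)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if expected_gain(mid, offers, probs) > cost:
            lo = mid   # another draw still pays at mid, so s lies above mid
        else:
            hi = mid
    return (lo + hi) / 2


offers = [1.0, 2.0, 3.0, 4.0]
probs = [0.25, 0.25, 0.25, 0.25]
s_cheap = switch_point(offers, probs, cost=0.25)    # close to 3
s_costly = switch_point(offers, probs, cost=0.75)   # close to 2
```

In this example raising the cost per draw from 0.25 to 0.75 lowers the switch point from about 3 to about 2, which agrees with the comparative static on search costs described above.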
For the reader interested in the mathematics of optimal stopping, Chow, Robbins, and Siegmund [8 ] is an excellent, though difficult, reference. Good sources for the probability theory necessary to understand the optimal stopping literature are Ash [3], Chung [1], and Feller [/3].

Perhaps the greatest motivation for search theory research has been in the analysis of non-degenerate wage distributions and in the analysis of unemployment. The optimal search strategy literature has attacked these problems from a partial equilibrium analysis, that is, regarding either the employer's or the job seeker's behavior as exogenous. Early equilibrium wage models were generally unsuccessful in supporting non-degenerate wage distributions. Indeed, in many of the early models, the wage distribution collapsed to the single monopsony wage. This has been shown to be a result of assuming that: there is a single market, the number of employers in the market is large, the cost of search is positive, employers maximize profits, employees maximize discounted net wages, and the equilibrium distribution is known by all. In the early 1970's a number of authors presented models in which some of the above assumptions were relaxed and which sustained non-degenerate wage distributions. Mortensen [30], Diamond [/0], Mirman and Porter [28], Lucas and Prescott [23], and Telser [4fy] have each presented equilibrium models explaining wage distributions. More recently Varian [5L] has shown that the search structure can explain the existence of sales, Butters [7] uses a search structure for analyzing advertising, and numerous authors have used the search structure for analyzing the effects of government policy on unemployment.

3. Equilibrium in Search Models.

In this section a definition of equilibrium in search models is presented, and for certain elementary models this equilibrium is shown to exist. Much of the notation and many of the concepts in this section are taken from Arrow and Hahn [2 ].
In an elementary general equilibrium model there might be $n$ distinguished goods, $F$ firms, with firm $f$ possessing a set of feasible production allocations $Y_f$ in $\mathbb{R}^n$, and $H$ households, with household $h$ having an initial endowment $\bar{x}_h$ in $\mathbb{R}^n$, a utility function $U_h: \mathbb{R}^n \to \mathbb{R}$, and a share $d(h,f)$ of firm $f$. Here $d(h,f) \ge 0$ and for each $f$, $\sum_h d(h,f) = 1$. A price vector $p^*$, a consumption allocation $x^* \in \bigoplus_{h=1}^{H} \mathbb{R}^n$, and a production allocation $y^* \in \bigoplus_{f=1}^{F} Y_f$ constitute a general equilibrium if

(a) $p^* > 0$, where $p^* \in \mathbb{R}^n$, that is $p^*(j) \ge 0$ for $j = 1, 2, \ldots, n$ and for some $j$, $p^*(j) > 0$;
(b) $\sum_h x^*_h \le \sum_h \bar{x}_h + \sum_f y^*_f$;
(c) $y^*_f$ maximizes $p^* y_f$ subject to $y_f \in Y_f$;
(d) $x^*_h$ maximizes $U_h(x_h)$ subject to $p^* x_h \le p^* \bar{x}_h + \sum_f d(h,f)\, p^* y^*_f$.

In the general equilibrium framework households and firms have full market information. No utility maximizing household will make a purchase of good $i$ from firm $f$ if firm $f$ does not post the lowest price for good $i$. Since any firm would capture the entire market demand by any undercutting of the market, it is easy to show that in a full information competitive market model all firms and households face the same prices. In a search model, price information is not universally distributed. Households or firms act upon their expectation of prices. Even though a household's decision of how much to buy might be based upon an observed price, its decision of where to shop is usually based upon price expectation and not upon a full set of observed prices. A single $n$ component price vector, since it need not exist in a search market economy, cannot be expected to eradicate excess demand as it does in a full information general equilibrium model. In the search model we have for each household $h$ and firm $f$ a price vector $p(h,f)$ and a vector $p(f)$, where $p(h,f)$ represents the price household $h$ expects firm $f$ to charge and $p(f)$ is the price that firm $f$ posts. In this model, only households are searchers. In equilibrium it is reasonable to expect that a household shops where it expects to maximize utility and that for this firm the expected and posted price should agree. This leads to the following definition of equilibrium for a search model.

DEFINITION: A price profile $p^*(h,f)$, $p^*(f)$, an allocation vector $y^*$, a consumption vector $x^*$, and a choice function $c^*: H \to F$ is a competitive search equilibrium if:

(a) $p^*(h,f) > 0$, $p^*(f) > 0$;
(b) $y^*_f$ maximizes $p^*(f)\, y_f$ subject to $y_f \in Y_f$;
(c) $x^*_h$ maximizes $U_h(x)$ over all $x$ in $\mathbb{R}^n$ for which there exists $f \in F$ with $p^*(h,f)(x - \bar{x}_h) \le \sum_{k \in F} d(h,k)\, p^*(k)\, y^*_k$;
(d) $c^*(h) = f$ implies that $p^*(f)(x^*_h - \bar{x}_h) \le \sum_{k \in F} d(h,k)\, p^*(k)\, y^*_k$;
(e) $\sum_{h:\, c^*(h) = f} (x^*_h - \bar{x}_h) \le y^*_f$ for each firm $f$;
(f) $p^*(h, c^*(h)) = p^*(c^*(h))$.

Conditions a-f have obvious interpretations. Condition a is that expected and posted prices satisfy the standard notions of a price, that is, they are non-negative and the price of some good is positive. Condition b is that firms are profit maximizers, while conditions c and d are the conditions that individuals are expected utility maximizers. Condition e is that in equilibrium there is no excess demand felt by any firm. Condition f is that the expected price held by a household agrees with the posted price at the firm where the household has transactions.

The obvious question is whether such an equilibrium exists for a search model. It will be shown below that the answer is affirmative if we have sufficient continuity conditions on the household demand and production supply functions and if we have a Walras' law type assumption on each demand. We begin by letting $c$ be a function from $H$ into $F$ and listing our assumptions.

Assumption 1. For each firm $f$ and each $p > 0$, $p \in \mathbb{R}^n$, there is a choice of $y_f(p)$ in $Y_f$ such that $p\, y_f(p) \ge \max p\, y$ over all $y \in Y_f$. Furthermore the map $p \to y_f(p)$ is a continuous map from $\{p : p > 0\}$ into $Y_f$.

Assumption 2.
For each household $h$, and each $p \in \bigoplus_{f=1}^{F} \mathbb{R}^n$ with $p(f) > 0$ for all $f$, there is a choice $x_h(p)$ in $\mathbb{R}^n$ such that

$$U_h(x_h(p)) \ge \max\Big\{U_h(x) : x \text{ such that } p(c(h))(x - \bar{x}_h) \le \sum_{k \in F} d(h,k)\, p(k)\, y_k(p(k))\Big\}$$

and

$$p(c(h))(x_h(p) - \bar{x}_h) \le \sum_{k \in F} d(h,k)\, p(k)\, y_k(p(k));$$

further, the map $p \to x_h(p)$ is continuous from $\{p : p \in \bigoplus_f \mathbb{R}^n,\ p(f) > 0\}$ into $\mathbb{R}^n$.

For each firm $f$ and price function $p$ in $\bigoplus_f \mathbb{R}^n$, let

$$Z_f(p) = \sum_{h:\, c(h) = f} \big(x_h(p) - \bar{x}_h\big) - y_f(p(f)).$$

Walras' law states that $\sum_{f \in F} p(f)\, Z_f(p) = 0$; we need a somewhat stronger assumption, which is as follows.

Assumption 3. For no $p$ in $\bigoplus_f \mathbb{R}^n$ with $p(f) \in S^n$ (the unit simplex) is it the case that $Z_f(p)(i) > 0$ implies that $p(f)(i) = 1$.

This is a condition on the function $c$. In essence it states that if for some price function $p$ there is excess demand in the system, then there is some firm $f$ experiencing excess demand for some good $i$ where the price of good $i$ charged by firm $f$ is not the highest price the firm could charge. In a general equilibrium model assumption 3 is a consequence of Walras' law that $p\, z(p) = 0$. We prove in Chapter IV the following theorem:

Theorem: Under assumptions 1, 2, and 3 a competitive search equilibrium exists with $c^* = c$.

The proof of this theorem is a direct application of the Brouwer fixed point theorem. It is similar to the proof for the existence of a competitive general equilibrium appearing in Chapter 2 of Arrow and Hahn [2 ]. An interesting corollary on price distributions can be obtained by appending two search models together.

Corollary: In a search equilibrium two different firms may post different prices for identical commodities.

4. A Housing Search Model.

In the next section I present a simple one period search model which has significance in housing analysis. In the housing market we find that potential buyers visit (according to some process) sellers to gain information about the characteristics of the unit the seller is offering.
The potential buyer, without full information about the housing market and with knowledge of the seller's asking price but not of his reservation price, makes a bid on the unit. The potential seller must await bids from buyers and must decide the level of his asking price as well as when to accept a bid which might be below the asking price.

It should be clear that the individual search processes in the housing market do not follow any simple model. The housing market is a dynamic market with buyers and sellers learning as they sample. Buyers do not sample at random but rather develop a search strategy. Sellers need not wait for buyers but may and do advertise. Furthermore, market brokers (Realtors) exist to facilitate the exchange of price and quantity information. However, the data transmitted by realtors need not always be accurate. A search model which attempts to incorporate each of these factors will be intractable to mathematical analysis. The obvious hope is that, as with labor market analysis, a simplified model will capture enough of the behavior to yield valid analysis.

The seller's problem is much the same as that of the job seeker in labor market models. The seller is faced with a sequence of offers and must decide when to stop sampling and sell. This problem is well researched and the optimal strategy under a wide range of assumptions concerning the seller's objectives is known. The potential buyer's problem is not well developed in the literature. The buyer does not know whether or not a given bid will be accepted, and thus this problem is not covered by the optimal search literature for employers' strategies.

A simple model of the buyer's problem can be stated as follows. At time $n$, the buyer samples from among $m$ classes of units. Associated with each unit is an unobserved reservation price below which it will not be sold. Within class $j$, the reservation price is a random variable with distribution $F_j(\cdot)$. For each sample, the buyer incurs a fixed cost $c$. If he purchases a unit from class $j$ at period $n$ for price $P_j(n)$, he then enjoys the payoff $U(X_j, Y - P_j(n) - nc)$, where $Y$ = individual income. We assume that the probability of drawing a unit from class $j$ at the $n$th draw is $p_j$, constant for all $n$. Let $P_j(n)$ be the buyer's bid at time $n$ for unit $j$ and let $P$ be the function with values $P_j(n)$. Let $i(n)$ be the class of the unit sampled at the $n$th period and let $Z(n)$ be the actual reservation price for the unit sampled at the $n$th draw. If the buyer has income $Y$ and bid structure $P$, he will enjoy payoff

$$B(Y,P) = U\big(X_{i(n)},\ Y - nc - P_{i(n)}(n)\big)$$

if and only if $P_{i(n)}(n) \ge Z(n)$ and $P_{i(j)}(j) < Z(j)$ for $j < n$. The buyer's problem is to find the bid structure $P$ which optimizes $E\,B(Y,P)$. Let $W(Y,P) = E[B(Y,P)]$; then it follows that if $P^*$ optimizes $W(Y,P)$, then $LP^*$ defined by $LP^*_j(n) = P^*_j(n+1)$ optimizes $W(Y-c,P)$.

It is important to note that $P$ does not define a stopping rule for the process $i(n)$ but does define a stopping rule for the process $[i(n), Z(n)]$. It is the unobservability of the random variables $Z(n)$ that distinguishes this problem from that solved in the literature. To the best of my knowledge this problem, even under simplified assumptions, has yet to be solved. In the model presented below this problem is finessed by simply assuming the buyer has a bid structure.

A number of factors will affect the welfare of individuals in the housing market. One factor which is either affected directly by housing market policies or is indirectly affected by transportation policies is the size of the market. In the next section, I analyze how expanding the market affects, through the search process, the welfare of buyers and sellers. In particular I show that sellers are made worse off by expansion, and whether or not buyers are made better off depends upon the ratio of buyers to sellers.
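The following minimal sketch illustrates the one period model analyzed in the remainder of this section, under its stated rules: each of m buyers visits one of n sellers uniformly at random, a unit goes to the highest visiting bid that meets its reservation price, and ties are broken at random. The closed form is written as it follows from those rules, and the brute-force enumeration is included only as a check. The function names, the list-of-lists layout for bids, and the small numerical market are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product


def buyer_success_closed_form(i, bids, reservations):
    """Closed-form chance that buyer i makes a successful bid (bids[i][j] is Y_ij)."""
    m, n = len(bids), len(reservations)
    rho = Fraction(n - 1, n)  # chance a given buyer visits some other seller
    total = Fraction(0)
    for j in range(n):
        y = bids[i][j]
        if y < reservations[j]:
            continue  # unit j is not in A_i
        equal = sum(1 for s in range(m) if bids[s][j] == y)     # m * B_j(Y_ij)
        at_least = sum(1 for s in range(m) if bids[s][j] >= y)  # m * H_j(Y_ij)
        total += (rho ** (at_least - equal) - rho ** at_least) / equal
    return total


def buyer_success_enumerated(i, bids, reservations):
    """Same chance, found by enumerating every assignment of buyers to sellers."""
    m, n = len(bids), len(reservations)
    total = Fraction(0)
    for visits in product(range(n), repeat=m):
        j = visits[i]
        y = bids[i][j]
        present = [bids[s][j] for s in range(m) if visits[s] == j]  # includes buyer i
        if y >= reservations[j] and y == max(present):
            total += Fraction(1, present.count(y))  # ties are broken at random
    return total / n ** m


def sale_probability(num_eligible, n, x=1):
    """Chance a unit sells when num_eligible buyers per copy meet its reservation price."""
    rho_x = Fraction(n * x - 1, n * x)
    return 1 - rho_x ** (x * num_eligible)


bids = [[5, 3], [5, 4], [2, 4]]   # three buyers, two units
reservations = [4, 3]
```

In this small market the sale probability of a unit with two eligible buyers falls from 3/4 to 175/256 when the market is replicated twice, in line with the claim that sellers are made worse off by expansion.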
The basis for the analysis is a one period market model with $m$ buyers and $n$ sellers. Each buyer $I_i$ is assumed to have a bid structure $P_{ij}$, where $P_{ij}$ is the bid of individual $I_i$ for seller $J_j$'s unit. Each seller $J_j$ is assumed to have a reservation price $X_j$ for his unit. At the beginning of the period, the buyers are distributed independently of each other among the sellers such that the probability that individual $I_i$ visits seller $J_j$ is $1/n$. $I_i$ buys unit $J_j$ with probability $1/k$ if and only if $I_i$ visits $J_j$, $P_{ij} \ge X_j$, $P_{ij} \ge P_{sj}$ for all individuals $I_s$ visiting $J_j$, and the number of individuals going to seller $J_j$ with bid $P_{sj} = P_{ij}$ is exactly $k$. See the appendix to this essay for further results on this problem.

This one period model can be thought of as an equilibrium model where the market process is such that buyers and sellers are replaced as they are successful in the market and in which bids and reservation prices are independent of experience.* Before continuing with the analysis it is necessary to introduce additional terminology.

* In general, this assumption will not be consistent with optimal search with positive search costs.

BASIC Model.

$I = \{I_i \mid i = 1, 2, \ldots, m\}$, the set of buyers.
$J = \{J_j \mid j = 1, 2, \ldots, n\}$, the set of units.
For an arbitrary set $S$, let $\|S\|$ denote its cardinality.
$\rho = (n-1)/n$.
$Y_{ij}$ is the bid of buyer $I_i$ for unit $J_j$.
$X_j$ is the reservation price for unit $J_j$.
$G_j(r) := \|\{I_i \mid Y_{ij} < r\}\|/m$; this is the fraction of buyers whose bid for the unit $J_j$ falls below $r$.
$B_j(r) := \|\{I_i \mid Y_{ij} = r\}\|/m$.
$H_j(r) := \|\{I_i \mid Y_{ij} \ge r\}\|/m$.
$F(r) := \|\{J_j \mid X_j < r\}\|/n$.
$C_j := \{i \mid Y_{ij} \ge X_j\}$.
$A_i := \{j \mid Y_{ij} \ge X_j\}$; if $j \in A_i$ then individual $I_i$ has a bid for unit $J_j$ at least as large as its reservation price.

PROPOSITION 4.1. The probability that individual $I_i$ will make a successful bid is given by

$$\sum_{j \in A_i} \frac{1}{m B_j(Y_{ij})}\Big[\rho^{\,m[H_j(Y_{ij}) - B_j(Y_{ij})]} - \rho^{\,m H_j(Y_{ij})}\Big].$$

Proof.† The probability that $I_i$ visits unit $J_j$, is successful, and the number of bidders $I_t$ for unit $J_j$ with $Y_{tj} = Y_{ij}$ visiting $J_j$ is equal to $k$ is given by

$$\frac{1}{k}\cdot\frac{1}{n}\binom{m B_j(Y_{ij}) - 1}{k - 1}\Big(\frac{1}{n}\Big)^{k-1}\Big(\frac{n-1}{n}\Big)^{m B_j(Y_{ij}) - k}\Big(\frac{n-1}{n}\Big)^{m[H_j(Y_{ij}) - B_j(Y_{ij})]} \qquad (*)$$

provided $Y_{ij} \ge X_j$; it is 0 otherwise. Let $P(j,k)$ denote the expression given in (*); then the probability that $I_i$ is successful is given by

$$\sum_{j \in A_i}\ \sum_{k=1}^{m B_j(Y_{ij})} P(j,k),$$

which equals

$$\sum_{j \in A_i} \frac{1}{m B_j(Y_{ij})}\Big[\rho^{\,m[H_j(Y_{ij}) - B_j(Y_{ij})]} - \rho^{\,m H_j(Y_{ij})}\Big].$$

† Detailed proofs appear in Chapter IV.

PROPOSITION 4.2. 1) The probability that $J_j$ is sold is given by $1 - \rho^{\,m H_j(X_j)}$. 2) If unit $J_j$ is sold, the expected value of the sale is given by

$$\sum_{i \in C_j} \frac{Y_{ij}}{m B_j(Y_{ij})}\cdot\frac{\rho^{\,m[H_j(Y_{ij}) - B_j(Y_{ij})]} - \rho^{\,m H_j(Y_{ij})}}{1 - \rho^{\,m H_j(X_j)}}.$$

Proof. 1) Unit $J_j$ will not be sold only if each bidder with a bid for $J_j$ at least as great as $X_j$ visits a unit other than $J_j$. There are $m H_j(X_j)$ bidders with bids for unit $J_j$ at least as big as its reservation price $X_j$. The probability of going to a unit other than $J_j$ is given by $\rho$.

2) If $Y_{ij} \ge X_j$, then the unit $J_j$ will be sold for $Y_{ij}$ if some individual $I_s$ with bid $Y_{sj} = Y_{ij}$ visits $J_j$ and all individuals $I_t$ with bid $Y_{tj} > Y_{ij}$ go elsewhere. Hence the probability that $J_j$ is sold for exactly $Y_{ij}$ (where $Y_{ij} \ge X_j$) is given by

$$\big[1 - \rho^{\,m B_j(Y_{ij})}\big]\,\rho^{\,m[H_j(Y_{ij}) - B_j(Y_{ij})]},$$

which equals $\rho^{\,m[H_j(Y_{ij}) - B_j(Y_{ij})]} - \rho^{\,m H_j(Y_{ij})}$.

3) The probability that it is sold for $Y_{ij}$ ($Y_{ij} \ge X_j$) given that it is sold is simply the probability that it is sold for $Y_{ij}$ divided by the probability that it is sold.

4) To obtain the result stated, we need only observe that the set of individuals with bids for $J_j$ at least as great as $X_j$ is the disjoint union, over all $r \ge X_j$, of the classes of individuals whose bid for $J_j$ equals $r$.

In a full information deterministic market model a seller can affect the share of the market captured by varying his prices relative to those of other sellers. Proposition 4.2
tells us that it is a con- struct of this model that the welfare of any particular seller is independent of the reservation prices of other sellers. Proposition 1. states that buyers in this market compete with each other and that the buyer's problem is a game theory problem involving 59 the action of both the other buyers and the sellers. Suppose now each of the individuals in the model is replicated x times to give an expanded market. Our goal is to determine the effects upon the buyer's and seller's welfare from such an expansion. This is a partial equilibrium analysis in that we assume that price structures are not affected. If buyers and sellers determine their bids and reservation prices upon the distribution of bids and reservation prices and independently of the market size then the price structures will not change. Behavior of buyers and sellers independent of market size is suboptimal, however, since changing the size affects the probabilities of being visited and of having competition in a bid. The expanded market is the x time disjoint union of the market in the original model. new set of sellers can be denoted by i = 1,2,...,m, s = 1,...,x}. variables in the new model by a I: = {I. If we denote the state ^, we observe the following relations: J = {J. m = xm Thus the | j = 1,...,n s = 1,. ,x} 60 n = xn y = G B H y(n) ,s) nx-1 = (r) nx G.(r) = j = B.(r) (53,s) (r) J . = H.(r) (r) 3 (3,s) F (r) = F(r) C i e C. = -{(i,s) (3 ,s) 3 j : = {(j,s) A Proof. indivudal xm[H.(Y..) 13 J 1 mB.(Y..) 13 3 = Ux~ A. 1,...,x} will make a I(is) given by successful bid is JeA C. 1,)...,)x} = s c A = In the replicated market model, PROPOSITION 4.3. the probability that s - B.(Y..)] J 1J() ) Use the fact that A(i,s) xmH.(Y..) 11 (x) s=1 A and then use Proposition 4.3. replacing all the variables with their values in the replicated market model. PROPOSITION 4.4. will be sold is of the sale of The probability that 1 - y(x) J( 5 5 ) xmH. (X.) 33 . 
(3,s) The expected value given that a sale occurs is 61 Yi 1 p'(x) 3 xm[H. (Y. S)-B 31l mB. (Y. .) . 3 1,3 i3C Proof. xmH.(Y..) 1(Y,3)] p3x) - xmH. (X.) J 3 (X) 1 - replacing variables Use Proposition 2. by their values in the replicated market model and C use the fact that USl (jIS) C. We are now able to determine how expending the market through replication affects the welfare of buyers and sellers. Let THEOREM 4.5. 1 P (x,i) I ]EA. xm[H.(Y. .) 1, 3 -P x) 3 B. (Y. . (the probability that buyer the market replicated A. a > [H. (Y. 31 1,) buys a unit I (i,1) times. in a = m/n. Let - B.(Y. .)] 3 1)3 ) P(Xi)- > 0 -1 nx > 2 are and 0. P(x<i) A sufficient condition for b) that sup ] xmH .(Y. Sufficient conditions for a) that x mB. (Y. .) a 1nx-l nx ln ticul ar if nx > 2 and A. nx a < 5/7, 3P(x,i) < 0 ( nx>2). 0. 0 are In par- 62 a One can reasonably interpret as a measure of congestion among buyers or equivalently for a fixed market size, is a rough measure of the competition a Theorem 4.5 between buyers. states that when this competition is low, when there are proportionately more sellers than buyers, buyers are made worse off states that if a Theorem 4.5 by market expansion. buyer tends to be a low bidder on each unit whose reservation price doesn't exceed his bid, then market expansion makes the buyer better off. Market expan- sion affects the buyers by increasing the number of competitiors and by increasing the number of opportunities. gives sufficient conditions Theorem 4.5 for one of these effects to dominate. THEOREM the probability that is replicated x Q(x,j) Let 4.6. J . () ,s) times. Let = 1 xmH. (X.) ( - pi(x) x is sold when the market E(x,j) Y. . . =m 1E.J p(x) xm[H.(Y. .) 3 l' - B.(Y. .)]xmH - p(x) ' 3 xmH. (X.) 1 - p(x) x be .(Y. 3 1i,) .) ' ( be the expe.cted value of the sale of unit J(j s 63 given that a sale takes place in the market replicated x a) Then if times. Necessary and sufficient conditions for < 0 b) nx > 2. are that C. 
$\neq \emptyset$. b) Necessary and sufficient conditions for $\partial E(x,j)/\partial x < 0$ are that there are $i$, $i'$ with $Y_{i,j} > Y_{i',j} \geq X_j$.

Proof. See Chapter IV.

The interpretation of Theorem 4.6 is straightforward. Expanding the market is never beneficial to the seller. In particular, if there are at least two buyers with different bids exceeding the reservation price, the seller is made worse off by expansion.

5. An Optimal Bidding Problem in a Search Model.

In some markets in which search is a prevalent feature, there is a sufficient flow of information that buyers and sellers have full knowledge of the price distribution. Search still occurs since prospective buyers, although they know price distributions, do not know which seller is posting the lower prices. An example of this is the residential housing market in urban areas, where realtors maintain extensive records of past transactions and make these available to prospective home buyers. The potential buyer visits a unit and then may tender a bid for that unit. The decision of how much to bid is determined in part by the bidder's expectation of the seller's reservation price and in part by the bidder's wealth. In this section it is shown that an optimal bidding strategy exists and that if search is costly this strategy need not result in a bid pattern that is monotonic with respect to time.

The bidder samples sequentially and at time $n$ samples the pair $(X_{i(n)}, Z(n))$, or equivalently $(i(n), Z(n)) \in \{1,2,\ldots,m\} \times \mathbb{R}$, where $X_{i(n)}$ (equivalently $i(n)$) is observed but $Z(n)$ is not; however, the distribution of $Z(n)$ given $i(n)$ is known and denoted by $F_{i(n)}$. The probability that $i(n) = j$ is fixed and denoted by $\lambda_j$ ($j \in \{1,2,\ldots,m\}$). The bidder starts with income $Y$ and incurs a fixed cost $c > 0$ per each draw of the sample. If the individual bids $P_{i(n)}(n)$ at time $n$ for $X_{i(n)}$, and $P_{i(n)}(n) > Z(n)$, he will enjoy a one time payoff of $U(X_{i(n)}, Y - nc - P_{i(n)}(n))$, provided $P_{i(j)}(j) < Z(j)$ for $j < n$. Let $P$ be a bid profile, that is, $P_i(n)$ is the bid at time $n$ for $X_i$, and let $W(Y,c,P)$ be the expected payoff associated with the bid profile $P$. The bidder's problem is to find a bid profile $P^*$ that maximizes $W(Y,c,P)$.

The bidder's problem is not too difficult to understand. If he bids too low, he will fail to make a buy and then must incur the cost of additional search. If, on the other hand, his bid is in excess of that needed to make a buy, then the difference between his bid and the minimum needed to secure the buy is lost opportunity. In the model described above, it is assumed that search requires little time and so utility is not time discounted. To study the bid profile, the shift operation $L$ is introduced, where if $P$ is a bid profile, $LP$ is a bid profile with $(LP)_i(n) = P_i(n+1)$.

PROPOSITION 5.1. Let $L^jP$ be the bid profile with $(L^jP)_i(n) = P_i(n+j)$. Then

$$W(Y,c,P) = \sum_{i=1}^{m} \lambda_i U(X_i, Y - c - P_i(1)) F_i(P_i(1)) + \Big[1 - \sum_{i=1}^{m} \lambda_i F_i(P_i(1))\Big] W(Y - c, c, LP).$$

Proof. 1) Let

$$h(P,n) = \mathrm{Prob}\{P_{i(n)}(n) > Z(n)\} = \sum_{i=1}^{m} \lambda_i F_i(P_i(n)),$$

$$H(Y,P,n) = \sum_{i=1}^{m} \lambda_i U(X_i, Y - nc - P_i(n)) F_i(P_i(n)) / h(P,n) = \{\text{expected payoff given } P_{i(n)}(n) > Z(n) \text{ and } P_{i(j)}(j) < Z(j),\ j < n\}.$$

Assume $U(X_i, Z)$ is nondecreasing in the second argument for each $i$. Then

$$W(Y,c,P) = h(P,1)H(Y,P,1) + [1-h(P,1)]h(P,2)H(Y,P,2) + [1-h(P,1)][1-h(P,2)]h(P,3)H(Y,P,3) + \cdots = \sum_{j=1}^{\infty} \prod_{i=0}^{j-1} [1-h(P,i)]\, h(P,j) H(Y,P,j) \qquad [h(P,0) \equiv 0].$$

2) Note that $\prod_{i=0}^{n-1}[1-h(P,i)]\,h(P,n) = \mathrm{prob}\{P_{i(n)}(n) > Z(n) \text{ and } P_{i(j)}(j) < Z(j),\ j < n\}$.

3) $h(P,n+1) = h(LP,n)$ for $n = 1,2,\ldots$ and $H(Y,P,n+1) = H(Y-c,LP,n)$ for $n = 1,2,3,\ldots$

4) $$W(Y,c,P) = h(P,1)H(Y,P,1) + [1-h(P,1)]h(P,2)H(Y,P,2) + \sum_{j=3}^{\infty} \prod_{i=1}^{j-1}[1-h(P,i)]\,h(P,j)H(Y,P,j)$$
$$= h(P,1)H(Y,P,1) + [1-h(P,1)]\Big\{h(LP,1)H(Y-c,LP,1) + \sum_{j=3}^{\infty} \prod_{i=2}^{j-1}[1-h(LP,i-1)]\,h(LP,j-1)H(Y-c,LP,j-1)\Big\}$$
$$= h(P,1)H(Y,P,1) + [1-h(P,1)] \sum_{j=1}^{\infty} \prod_{i=0}^{j-1}[1-h(LP,i)]\,h(LP,j)H(Y-c,LP,j) \qquad (\text{by definition } h(LP,0) = 0).$$

5) $$W(Y,c,P) = \sum_{i=1}^{m} \lambda_i U(X_i, Y-c-P_i(1)) F_i(P_i(1)) + \Big[1 - \sum_{i=1}^{m} \lambda_i F_i(P_i(1))\Big] W(Y-c,c,LP).$$

PROPOSITION 5.2. P* Suppose Fi (P* (1)) I.
if for some i optimizes W(Y,c,P), 1 l, then LP* optimizes W(Y-c,c,P). Proof. 1) It suffices to show any bid profile, Q. W(Y-c,c,LP*) > W(Y-cc,Q). 2) Suppose W(Y-c,c,Q) > W(Y-c,c,LP*) Let P be the bid profile with Pi(l) = Pi*(l) Pi(n) = Qi(n-1) for n > 2 Then W(Y,c,P) = [1-X. i-x Xi m U(Xi,Y-c-Pi*(l)Fi(Pi*(l))+ n Fi(Pi*(l))]W(Y-ccQ) and 3) W(Yc,P) -m [l-n > Im U(Xi,Y-c-Pi* (1) )Fi (Pi* (1)) + Xi Fi(Pi*(1))]W(Y-c,c,LP*) and 4) W(Y,c,P) > W(Yc,P*) contradiction Proposition 2 states that if we know the optimal structure from time n+1 onward or W(Y- (n+1)c,c,LnP*), we can compute particular Pi*(n) must be if we P bid know ... P*n). the bids that maximizes. In 69 m (A) { i=1 U(X., Y - nc - P. .F.(P.) 11 m + [ .F. (P)] W(Y - nc, The maximum value that (A) W(Y-(n-1)cc,L (n-1)P*) and the value mize (A) P*(n) become Pi*(n). (and hence for c, takes on is then P. P*(l),...,P*(n)) perhaps helpful to note that if F. W(Y,c) is increasing in as expected. 1 reduces to p that maxi- [U(X ,Y-nc-p) = W(Y-nc,c,PnP*)]F.(p). mizes c that maxi- The problem of solving for solving the simpler problem of finding then L P*) W(Y,c) Y = It is sup W(Y,c,P): P and decreasing in In the event that the distribution is associated with a probability measure with finitely many atoms, the search for optimal be restricted to the atoms. P. In the case that can F. are continuously differentiable we can develop further results. Let us now assume that each F. 1 tinuously differentiable in interval lim F (p) = 0. is twice con- (0,o) and that We further assume that for each i p- o U(X ,Z) is twice continuously differentiable in 3U(X ,Z) 3Z 2U > Z> z) 2 (X.,Z) < 0 lz>O Z. 70 lim U(X.,Z) = 0. Z+1 THEOREM 5.3. Under the conditions above an optimum bid profile exists. Proof. 1) W(Y-nc,c) = 0 be the least let for all To find p pute To find n > N, and for > 2c, we can find and of course = M(p). p. > c m(p.) P *(N-1) < 0; p =- Zn 11 and we can com- . U(X.,Y-(N-1)c (Pi *(N-1)) p.*(N-2) ). For m(P ) < 0; 0 < p. 
< 2c, thus can compute - i D *(N-2) pi < 0, since - m(p) m(Pi) that maximizes and that pi we need to find m(p ) = [U(XiY-(N-2)c-p) W(Y-(N-2)c,c)]F p Then we can U(X.,Y-(N-l)c-pi)Fi(p.). W(Y-(N-2)c,c) maximizes N we need to find a solu- 0 = Thus we can find 4) and -Let is continuous, so there exists - Pi* (N-l') )F for Y-nc < 0. i Z < 0. nc > Y. U(X.,Y-(N-l)c-pi)F(p.) m(p ) that maximizes 3) such that Pi*(N-1) < 0 m(p.) furthermore for = 0. tion to maximize Now for n such that = 0 W(Y-(N-l)c,c) U(X.,Z) < 0 for all n P (n) 2) Since = 0 and is continuous m(P ) W(Y-(N-3)c,c). and 71 5) W(n) = W(Y-nc,c); In general, let pi *(n) we need only find [U(X ,Y-nc-p ) W(n)]Fi(p.). - m(p ) = 0 ous with p Now is continu- m(p.) m(p ) < 0 and = m(p.) that maximizes pi < 0 for to find for so a solution is always possible. pi > Y-nc, In general we seek solutions to the problem of maximize F.(p:) = M(p ) = [U(X ,Y-nc-p ) - W(n)]Fi(p ). 0 for all p. such that we can choose for our optimal m(0) = 0. m(p ) If > 0 pi, for some pi < Y-nc pi = 0 If then and then our pi, optimal solution must satisfy the differential condition. 0 = m'(p) = [U(X,Y-nc-p)-W(n)]F'(p) - F(p) U 2 (X,Y-nc-p). Thus at our optimal solution p, we must satisfy, F'(p) F (p) _ U 2 (X,Y-nc-p) U(X,Y-nc-p) ** - W(n) It is somewhat surprising that the optimal bid profile, Pi*(n), need not be monotonic in n. * We have dropped the subscript ence of notation. **= U2(XZ) = (9U Z (XZ). i for conveni- 72 Pi*(n) maximizes the generic function + (1-F(p))W = M(p) As H(Y-p)F(p) n goes from n both Y M t (P) = 0 = [H(Y-P)-W]F'(P)-F(P)H'(Y-P). and then m" (P) W fall. F'(P)H'(Y-P) + F(P)H"(Y-P) is a unique P = P(Y,W) P solve 9M' (P(Y,W)) 3Y 9P 9Y - such that g 90 3Y 90 - MW If F"(-)<0, F' (P)H' (Y-P) M"(P) < 0 so M(P) = 0. M'(P) n+1 to p = P, At the optimal = [H(Y-P) -W]F"(P) + = 0. 
and'there Now let Taking M'(P(YW)) W = we find that 0, F(P)H"(Y-P) - H'(Y-P)F'(P) [H(Y-P)-W]F"(P), - 2 F'(P) H'(Y-P) + F (P) H t (Y-P) = F(P)H"(Y-P) - H'(Y-P)F'(P) (P) > 0 F'(P) < 0 M"(P) Since as n increases both Y and W fall there is no conclusive determination of what happens to the optimal bid P *(n). One might suspect and it is easy to construct examples where F. (P) discrete probability distribution and falling as n increases. is a pi*(n) is The following example reveals that it is possible for pi*(n) < p1*(n+1). 73 Example. M = 1 and hence Y = 14 F(-) X Q C = 5 = 1 = 3 P = 2-1/2 is discrete with F(t) 0 t<2.5 .571 2.5 t<3 .6 1 3<t<l00 100<t = 1 U(Xl) U(X,1.5) = 1.05 U(X,6) U(X,6.) Note 1.49 = 1.538 = is chosen such that the above values U(X,-) could be generated from a function U(Xt) > 0 D 2U (X ,t) with U(Xt) < 0. t P*(n) = 0 F(-), Y-3c = 14-15 = -1, Since 1) n > 3. for p*(n) c {2.5, 31 2) follows that = U(X, 4) for n = 1,2. P(2) = 1.5 )(.571 = U(Y-2c-P)F(P) Q F(P)U(X,Y-c-P) - F(Q)U(X,Y-c-Q) 1.134 Furthermore, by the nature of U(Y-2c-Q)F(Q) = U(X,1)(.6) = .6 > .59955 = (1.05)(.571) 3) it follows that P(1) < P(2). = 3 and W(l) + (1-F(P))W(l) = .6 it . = 1.135598 + (1-F(Q)]W(l). So P(L)=P=2.5. 74 Search is naturally a part of the urban economy and particularly of the housing market. There are numerous directions that search theory research might take with respect to urban economics. One area of research which seems especially fruitful, is to develop rational bid rent search models. A rational bid rent search model is one in which buyer's bids and seller's reservation prices are consistent with optimal search strategies given some rule for determining how buyers and sellers are brought together.* It is interesting to note that if in a given bid rent model,** the equilibrium bids satisfy the following conditions: 1) For each I. there is a unique' J. such J0 that B. -1,J0 j(i) = B s,30 i / s. (denote j 0 ). 
* Examples of such rules are: 1) Naive rule: each buyer, independent of the action of other buyers, visits a given seller at random with equal probabilities of visiting any one seller. 2) Maximal expected utility: buyer $I_i$ chooses at random with equal probability among those sellers $J_j$ that maximize expected utility.

** See Alonso [1] and Wheaton [5-6].

2) For each $J_{j_0}$ there is an $I_i$ such that $j(i) = j_0$.

Then if we let $X_j = \sup_i B_{i,j}$, the bids $B_{i,j}$ and reservation prices $X_j$ are consistent with the search rule that each buyer visits the seller which maximizes his expected utility.

A natural consequence of the full information bid rent model is that individuals with the same income and tastes will end up enjoying the same level of utility. This fact is often exploited in empirical studies to estimate parameters of individuals' utility functions and to estimate marginal rates of substitution between various housing characteristics. In a stochastic search model the hypothesis of constant utility for individuals with the same preferences and initial income is not supported, and hence parameter estimates based upon the assumption of indifference need not be consistent. It remains to determine, however, the degree of inconsistency that this introduces.

See Wheaton [57-].

CHAPTER III

ESTIMATION AND HYPOTHESIS TESTING IN THE PRESENCE OF HETEROSKEDASTICITY

1. Introduction.

The often encountered one equation linear model can be written as either

$$Y_n = \sum_{i=1}^{K} x_{n,i}\beta_i + \varepsilon_n \qquad \text{or} \qquad Y = X\beta + \varepsilon,$$

where $\beta$ is a $K \times 1$ vector, $X$ is an $N \times K$ matrix, $Y$ and $\varepsilon$ are $N \times 1$ vectors, $Y$ and $X$ are observed, $\beta$ and $\varepsilon$ are unobservable, and $\varepsilon$ is a vector of random variables with $E(\varepsilon) = 0$. The usual analysis involves the estimation of $\beta$, the testing of hypotheses on $\beta$, or predicting $y_{n+1}$ when $x_{n+1,1}, \ldots, x_{n+1,K}$ are given or predicted. The usual naive assumption that $E(\varepsilon\varepsilon^T)$ is a scalar diagonal matrix cannot usually be supported.
When the observations are generated by time series data one would expect serial correlation, $E(\varepsilon_s\varepsilon_t) \neq 0$ when $s \neq t$, while cross sectional and grouped data frequently imply problems of heteroskedasticity, $E(\varepsilon_i^2) \neq E(\varepsilon_j^2)$ when $i \neq j$.

The naive assumption of $E(\varepsilon\varepsilon^T) = \sigma_0^2 I$ is hard to give up. Under this assumption, together with assumptions on the data matrix $X$, one has the well known Gauss Markov theorem, which states that the ordinary least squares estimator $\hat\beta$ given by $(X^TX)^{-1}X^TY$ is a minimum variance linear unbiased estimator. Furthermore, the predictor $\hat y_* = x_*\hat\beta$ is a best linear unbiased predictor of $y_*$ given $x_*$. Under additional assumptions on the data matrix $X$, the asymptotic distribution of $\hat\beta$ can be computed. If $Q$ is given by $\lim_{N\to\infty} N^{-1}(X^TX)$, then $\sqrt{N}(\hat\beta - \beta)$ has a normal limiting distribution with mean 0 and covariance matrix $\sigma_0^2 Q^{-1}$. Furthermore, if $\hat\varepsilon = Y - X\hat\beta$, then $S^2 = \hat\varepsilon^T\hat\varepsilon/(N-K)$ is a consistent estimator of $\sigma_0^2$. Under the naive assumption that $E(\varepsilon\varepsilon^T)$ is scalar, one can easily compute the limiting distribution for $\hat\beta$, compute asymptotic confidence intervals for $\beta$, and test, at least using asymptotic theory, linear hypotheses on $\beta$. Indeed, most regression packages automatically report all these statistics, which of course are valid when $\Omega = E(\varepsilon\varepsilon^T)$ is scalar.

If $\Omega$ is not a scalar multiple of the identity, then the statistics reported by the usual regression packages do not have their usual meaning. To examine the loss resulting from using ordinary least squares when $\Omega$ is not a scalar multiple of the identity, we consider the case where the data matrix is non-stochastic and $\lim_{N\to\infty} N^{-1}X^TX$ exists and equals some $K \times K$ matrix $Q$. The ordinary least squares estimator $\hat\beta$ is given by

$$\hat\beta = (X^TX)^{-1}X^TY = (X^TX)^{-1}X^T(X\beta + \varepsilon) = \beta + (X^TX)^{-1}X^T\varepsilon.$$

The OLS estimator $\hat\beta$ is an unbiased estimator of $\beta$, and

$$\operatorname{plim} \hat\beta = \beta + \operatorname{plim}\,(N^{-1}X^TX)^{-1}(N^{-1}X^T\varepsilon) = \beta,$$

so $\hat\beta$ is a consistent estimator of $\beta$. Now, if $\Omega$ is positive definite then, since $\Omega$ is symmetric, we can find a diagonal operator $D$ and a unitary operator $U$ such that $\Omega = U^{-1}D^2U$. Let $S = D^{-1}U$; then $SY = SX\beta + S\varepsilon$ and

$$E(S\varepsilon\varepsilon^TS^T) = D^{-1}U\,U^{-1}D^2U\,U^TD^{-1} = I,$$

so by the Gauss Markov theorem,

$$\tilde\beta = (X^TS^TSX)^{-1}X^TS^TSY = (X^T\Omega^{-1}X)^{-1}X^T\Omega^{-1}Y$$

is the best linear unbiased estimator for $\beta$, and thus $\hat\beta$ is not efficient. In the case of heteroskedasticity, $\Omega$ is diagonal and $S$ is diagonal with $(S)_{ii} = (\sigma_i^2)^{-1/2}$. The covariance matrix for $\hat\beta$ is given by

$$E[(\hat\beta-\beta)(\hat\beta-\beta)^T] = E[(X^TX)^{-1}X^T\varepsilon\varepsilon^TX(X^TX)^{-1}] = (X^TX)^{-1}X^T\Omega X(X^TX)^{-1}$$

and not by $\sigma_0^2(X^TX)^{-1}$. Therefore $\sqrt{N}(\hat\beta-\beta)$ is not asymptotically distributed with mean zero and covariance matrix $\sigma_0^2(N^{-1}X^TX)^{-1} = \sigma_0^2Q^{-1}$. Furthermore, the commonly printed statistic $\frac{\hat\varepsilon^T\hat\varepsilon}{N-K}(N^{-1}X^TX)^{-1}$ is not a consistent estimator of the covariance matrix for $\sqrt{N}(\hat\beta-\beta)$. It is clear that the computation of a consistent estimator for this covariance matrix requires a consistent estimator for $N^{-1}X^T\Omega X$.

In the case of heteroskedasticity, $\Omega$ is diagonal, the operator $S$ is diagonal with $(S)_{ii} = (\sigma_i^2)^{-1/2}$, and $\tilde\beta$ is simply weighted least squares. If $\Omega$ is known then $S$ can be computed and the model $Y = X\beta + \varepsilon$ can be transformed to the model $SY = SX\beta + S\varepsilon$. Not only does OLS on the transformed model give the optimal linear estimator for $\beta$, but the usual test statistics computed by the standard regression packages for this transformed model can be correctly interpreted.

The identification of the heteroskedastic structure has importance beyond that of statistical consideration. In many instances the data has a natural grouping such that within each group the variances are constant. This may suggest to the investigator that while the groups share the same structural parameters $\beta$, the processes generating the structure are not the same across groups. The idea that the heteroskedastic structure conveys theoretical information is explored in [/2].

In general, $\Omega$ is not known and is difficult to estimate.
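The comparison made above between the naive OLS covariance, the correct OLS covariance, and the weighted least squares covariance can be illustrated numerically. The following is a minimal sketch for the single regressor case ($K = 1$), where the matrix expressions reduce to ratios of sums. The data values and function names are invented for illustration and do not come from the thesis, and $\sigma_0^2$ in the naive formula is taken, as an assumption, to be the average of the true variances.

```python
# Single-regressor (K = 1) illustration; data and names are hypothetical.

def ols_true_variance(x, sigma2):
    """(X'X)^{-1} X' Omega X (X'X)^{-1} for diagonal Omega and K = 1."""
    sxx = sum(xi * xi for xi in x)
    return sum(xi * xi * s for xi, s in zip(x, sigma2)) / sxx ** 2


def ols_naive_variance(x, sigma2):
    """sigma_0^2 (X'X)^{-1}, with sigma_0^2 set to the average variance."""
    sxx = sum(xi * xi for xi in x)
    return (sum(sigma2) / len(sigma2)) / sxx


def wls_variance(x, sigma2):
    """(X' Omega^{-1} X)^{-1}: variance of weighted least squares."""
    return 1.0 / sum(xi * xi / s for xi, s in zip(x, sigma2))


x = [1.0, 2.0, 3.0, 4.0]
sigma2 = [0.5, 1.0, 4.0, 8.0]   # variance grows with the regressor

true_var = ols_true_variance(x, sigma2)
naive_var = ols_naive_variance(x, sigma2)
wls_var = wls_variance(x, sigma2)
```

In this particular data set the naive formula understates the true OLS variance, and the weighted least squares variance is the smallest of the three. The direction of the naive error depends on how the variances relate to the regressor, so the ordering of the first two quantities should not be read as general.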
The solution, at least for special forms of $\Omega$, is either to develop sufficiently good estimators for $\Omega$ and for $S$, so that the transformed model using these estimators has desirable asymptotic properties, or to develop a consistent estimator for $N^{-1}X^T\Omega X$ so that hypothesis tests can be done using the OLS estimator $\hat\beta$ on the original model. In the case of heteroskedasticity, Halbert White has proposed a consistent estimator for $N^{-1}X^T\Omega X$ so that asymptotically valid confidence intervals and tests can be developed using $\hat\beta$. This procedure is explored in the next section.

2. A Theorem of Halbert White.

One approach to the problem of hypothesis testing when $\Omega$ is diagonal has been to look for a simply computable consistent estimator of the covariance matrix associated with the OLS estimator $\hat\beta$. White, in [5q ], proposes such an estimator and as a corollary develops asymptotically valid confidence intervals and hypothesis tests based upon $\hat\beta$. Before stating White's results, it is useful to motivate his approach. For this purpose, assume that $X$ is non-stochastic, although this assumption is not necessary to obtain White's results. As we have already seen, the covariance matrix for $\hat\beta$ is given by $(X^TX)^{-1}(X^T\Omega X)(X^TX)^{-1}$ and that for $\sqrt{N}(\hat\beta-\beta)$ is given by $(N^{-1}X^TX)^{-1}(N^{-1}X^T\Omega X)(N^{-1}X^TX)^{-1}$. Since $N^{-1}X^TX$ is observable, the problem reduces to developing an estimator for $N^{-1}X^T\Omega X$.

The matrix $X$ can be written as the stack of rows $X_1, X_2, \ldots, X_N$, where $X_n$ is a row vector with $(X_n)_j = X_{nj}$. The matrix $X^T$ can be written as $(X_1^T, X_2^T, \ldots, X_N^T)$, where $X_n^T$ is a column vector with $(X_n^T)_j = X_{nj}$. [Please note that $X$ is a real matrix so $X^T = X^*$.] In the case of heteroskedasticity, $\Omega$ is a diagonal matrix of the form

$$\Omega = \operatorname{diag}(\sigma_{11}^2, \sigma_{22}^2, \ldots, \sigma_{NN}^2).$$

The critical matrix $N^{-1}X^T\Omega X$ reduces to

$$N^{-1}\sum_{i=1}^{N} \sigma_{ii}^2 X_i^TX_i.$$

Under fairly general restrictions on $X$ and $\varepsilon$, the strong law of large numbers can be invoked to show that

$$\lim_{N\to\infty} \Big\| N^{-1}\sum_{i=1}^{N} (\sigma_{ii}^2 - \varepsilon_i^2) X_i^TX_i \Big\| = 0,$$

where $\|\cdot\|$ can be taken to be any norm on $\mathbb{R}^{K\times K}$. This suggests that $N^{-1}\sum_{i=1}^{N}\varepsilon_i^2X_i^TX_i$ would be a good estimator of $N^{-1}X^T\Omega X$. However, $\varepsilon_i^2$, like $\sigma_{ii}^2$, is not observable, but $\hat\varepsilon_i$, which equals $y_i - X_i\hat\beta$, is observable, which leads one to speculate that, since $\hat\beta$ is a strongly consistent estimator of $\beta$,

$$N^{-1}\sum_{i=1}^{N} \hat\varepsilon_i^2 X_i^TX_i$$

might be a good estimator for $N^{-1}X^T\Omega X$. It seems likely that White followed a similar line of reasoning to obtain the following results.

Before stating White's results it is necessary to introduce several new definitions and notation and to enumerate the formal assumptions of his theorems. This we now do.

A1) The model is known to be $Y_i = X_i\beta_0 + \varepsilon_i$, $i = 1, 2, \ldots$, where $(X_i, \varepsilon_i)$ is a sequence of independent (not necessarily identically) distributed random vectors, such that $X_i$ (a $1 \times K$ vector) and $\varepsilon_i$ (a scalar) satisfy $E(X_i^T\varepsilon_i) = 0$. The scalar valued $\varepsilon_i$ are unobservable while $Y_i$ and $X_i$ are observable. The parameter vector $\beta_0$ is a finite unknown $K \times 1$ vector to be estimated.

A2) (a) There exist positive finite constants $\delta$ and $\Delta$ such that for all $i$, $E(|\varepsilon_i^2|^{1+\delta}) < \Delta$ and $E(|X_{ij}X_{ik}|^{1+\delta}) < \Delta$, $j,k = 1,2,\ldots,K$.
(b) $M_n = n^{-1}\sum_{i=1}^{n} E(X_i^TX_i)$ is nonsingular for all $n$ sufficiently large, and for $n$ sufficiently large $\det M_n > \delta > 0$.

A3) (a) There exist positive finite constants $\delta$ and $\Delta$ such that for all $i$, $E(|\varepsilon_i^2X_{ij}X_{ik}|^{1+\delta}) < \Delta$, $j,k = 1,2,\ldots,K$.
(b) The average covariance matrix is $V_n = n^{-1}\sum_{i=1}^{n} E(\varepsilon_i^2X_i^TX_i)$, and for $n$ sufficiently large $V_n$ is nonsingular with $\det V_n > \delta > 0$.

Let $\hat\beta_n = (X^TX)^{-1}X^TY$ be the OLS estimator of $\beta_0$, let $\hat\varepsilon_{in} = Y_i - X_i\hat\beta_n$, and let $\hat V_n = n^{-1}\sum_{i=1}^{n}\hat\varepsilon_{in}^2X_i^TX_i$. Finally, let $R$ be a known fixed $q \times K$ matrix and let $r$ be a fixed known $q \times 1$ vector.

A4) There exist positive constants $\delta$ and $\Delta$ such that for all $i$, $E(|X_{ij}^2X_{ik}X_{il}|^{1+\delta}) < \Delta$, $j,k,l = 1,2,\ldots,K$.

With the notation developed above, White then proves the following:

LEMMA 2.1. Given A1 and A2, $\hat\beta_n$ exists almost surely for $n$ sufficiently large and $\hat\beta_n \to \beta_0$ almost surely.

LEMMA 2.2.
Under A1 through A3, $\sqrt{n}\,V_n^{-1/2}M_n(\hat\beta_n - \beta_0) \overset{A}{\sim} N(0, I_K)$.

THEOREM 2.3.
(i) $|\hat V_n - V_n| \to 0$ almost surely;
(ii) $|(X^TX/n)^{-1}\hat V_n(X^TX/n)^{-1} - M_n^{-1}V_nM_n^{-1}| \to 0$ almost surely, under A1, A2, A3(a) and A4;
(iii) $n(R\hat\beta_n - r)^T[R(X^TX/n)^{-1}\hat V_n(X^TX/n)^{-1}R^T]^{-1}(R\hat\beta_n - r) \overset{A}{\sim} \chi_q^2$, given the null hypothesis $H_0: R\beta_0 = r$ and under A1 through A4.

3. Block Scalar Covariance Matrix.

In some circumstances in which heteroskedasticity is present, the observations can be grouped into a small number of groups such that variances are constant within each group. If the data is so grouped, the covariance matrix $\Omega$ will be of a block scalar form where the blocks may have unequal sizes. In general this is equivalent to having a model of the form $Y_{ij} = X_{ij}\beta + \varepsilon_{ij}$ ($j = 1,2,\ldots,J$; $i = 1,2,\ldots,N_j$) with $E(\varepsilon_{ij}^2) = \sigma_j^2$. In this case one tries to estimate the $\sigma_j^2$ and then to use these estimates as weights to transform the data. If the $\varepsilon_{ij}$ are independently distributed with normal distribution of zero mean and variance $\sigma_j^2$, one can use a maximum likelihood procedure to jointly estimate $\beta$ and $\Sigma$. However, even in this case the computation requires the solution of a nonlinear equation. An alternative procedure, the one proposed in this essay, is to iteratively estimate $\beta$ and $\Sigma$, and then to take as our estimator of $\beta$ and $\Sigma$ the limit of their iterations. In the case that $\varepsilon$ is normally distributed this becomes the procedure proposed in Oberhofer and Kmenta [33]. They, however, use an erroneous argument to show that such a limit exists. In this section, it is shown that this iterative procedure, regardless of the form of the likelihood function, does converge and that the resulting estimators have the usual desirable asymptotic properties. Proofs for the theorems in this section are given in Chapter IV.

Consider the linear model

$$Y_{ij} = X_{ij}\beta_0 + \varepsilon_{ij} \qquad (j = 1,2,\ldots,J;\ i = 1,2,\ldots,m_j),$$

which can also be written as $Y_j = X_j\beta_0 + \varepsilon_j$ or $Y = X\beta_0 + \varepsilon$, where:
1) $Y_{ij}$ and $\varepsilon_{ij}$ are real valued;
2) $X_{ij}$ is a $1 \times K$ vector with real entries;
3) $\beta_0$ is a fixed $K \times 1$ vector with real entries;
4) $Y_j$ and $\varepsilon_j$ are $m_j \times 1$ vectors with real entries whose $i$th elements are $Y_{ij}$ and $\varepsilon_{ij}$ respectively;
5) $X_j$ is an $m_j \times K$ matrix with real entries whose $i$th row is given by $X_{ij}$;
6) $Y$ and $\varepsilon$ are $N \times 1$ vectors formed by stacking $Y_j$ and $\varepsilon_j$, $j = 1,\ldots,J$, where $N = \sum_{j=1}^{J} m_j$, and $X$ is an $N \times K$ matrix with real entries formed by stacking $X_j$, $j = 1,\ldots,J$.

The problem is to estimate $\beta_0$, where $Y$ and $X$ are observed, and where $E(\varepsilon_{ij}) = 0$ and

$$E(\varepsilon_{ij}\varepsilon_{rs}) = \begin{cases} 0 & (i,j) \neq (r,s) \\ \sigma_j^2 & (i,j) = (r,s). \end{cases}$$

If the $\sigma_j^2$ are all known, then the appropriate linear estimator is weighted least squares, where $(Y_{ij}, X_{ij})$ are weighted by $1/\sigma_j$. If there is no knowledge about $\{\sigma_j^2 : j = 1,\ldots,J\}$ but the likelihood function of $(Y|X)$ is known, then one may use a maximum likelihood estimator, which in general requires the solution of nonlinear equations. An alternative to the maximum likelihood estimator, which is also often used when the likelihood function is unknown, is a weighted least squares procedure in which estimates of $\sigma_j^2$ replace the true values of $\sigma_j^2$ in the first mentioned weighted least squares procedure above. In this paper we discuss an iterative weighted least squares procedure that can be described as follows.

Step 1. Select any $J$ positive numbers and denote them by $\sigma_{1,1}^2, \ldots, \sigma_{1,J}^2$. Let $\Sigma_1$ be the block scalar matrix $\Sigma_1 = \operatorname{diag}(\sigma_{1,1}^2 I_{m_1}, \ldots, \sigma_{1,J}^2 I_{m_J})$.

Step 2. Let $B_1(\sigma_{1,1}^2, \ldots, \sigma_{1,J}^2, X, Y) = (X^T\Sigma_1^{-1}X)^{-1}X^T\Sigma_1^{-1}Y$.

Step 2n+1. Let $\sigma_{n+1,j}^2 = \frac{1}{m_j}(Y_j - X_jB_n)^T(Y_j - X_jB_n)$ and let $\Sigma_{n+1} = \operatorname{diag}(\sigma_{n+1,1}^2 I_{m_1}, \ldots, \sigma_{n+1,J}^2 I_{m_J})$.

Step 2n+2. Let $B_{n+1}(\sigma_{1,1}^2, \ldots, \sigma_{1,J}^2, X, Y) = (X^T\Sigma_{n+1}^{-1}X)^{-1}X^T\Sigma_{n+1}^{-1}Y$.

Choose as the estimate of $\beta_0$ any limit point of $\{B_n(\sigma_{1,1}^2, \ldots, \sigma_{1,J}^2, X, Y) : n \geq 1\}$.

In this paper we shall show that under fairly general assumptions:

(1) $\{B_n(\sigma_{1,1}^2, \ldots, \sigma_{1,J}^2, X, Y)\}_{n \geq 1}$ has a limit point for all pairs $(Y,X)$ satisfying the assumptions.

(2) For any $\epsilon > 0$ there exists $N_\epsilon < \infty$ such that $N > N_\epsilon$ implies that the probability that $\{B_n(\sigma_{1,1}^2, \ldots, \sigma_{1,J}^2, X, Y)\}_{n \geq 1}$ converges (to a unique limit point) is greater than $1 - \epsilon$, where $N = \sum_{j} m_j$.
has a limit 1-c. Where 90 (3) If an estimator , a1 1 ,... fixed {Bn (a1 1 ,... , 1 is a limit point of O(X,Y) X, Y)n>1 satisfies that for some 0 then p lim Q(X,Y) =0 Nn+o is a consistent estimator of 0 i.e., (4) 0, and Under additional assumption on X and s., the estimator described above has the properties that p lim (0-0) = 0 a) and N+co b) ) N(0,Q^ is asumptotically distributed A(-0) Q where is a fixed positive definite matrix. c) l 2 If we let be given by a. J (Y.-X.0)T(Y.-X.0) and let I be given by ml 2 m 1 a2 m2 2 a2 Q then and Q (Q) " = 1 X T -1 X 2 aj is a consistent estimator of is a consistent estimator of Q~ Q 91 Definition and Conventions. The model: . Y..X.. j 3 +C.. Yij = = 0 +: (j = Yu (j = 1,...,J) Y Y.., = X 0+ g u 0 =X0+ or or where Y., )X. e.., X.j, i = 1 ,...,m.) J, 1,..., Y, X, ., and C, are as described in the previous section. Assumption I: The data matrix X is nonstochas- tic. Assumption II: j such that for each Z XX X < inf ZC k There Z < |1Y.-X.B | , T, m. 3 sup ZTXIX J Z cIR k M. Z < T j For all and all is NxK m. The observed values taken by the dependent variable are realizations of an X < 0. Assumption IV: random vector 0 < X < T II 11=1 Assumption III: inf and any m ZI 11=1 exists Y N element Y = X O +F. which can be written matrix with real entries satisfying assumptions I and II and 60 unknown real numbers. E is an N IX) = E(c) = 0 bance vector with E(E V(£jX) = V(e|X) = V(c) is a Kx1 vector of element disturand with being a diagonal matrix o 92 comprised of blocks, and being of the form: J 2 a1 0 0 2 V(C) -. 2---------- o 0 2 0I I~J identity matrix and I. is the m.xm. J121 J1 a positive unknown real number. where For each 1) Assumption V: j2, either 2 a. c.. 121 is are iid random variables, or j for all 2) m (N) and and all ITAR r- M < o N, such that 2) are independently distributed. {s. . } 1J1 1 (Y-X.B)T(Y.-X.B). 2121 m J 21 When there is no danger of confusion z. (m . . . ,m z.(m DEFINITION: X,Y,B) where N ZN(B). 
will simply be denoted by $z_j(B)$ or $z_j^N(B)$.

DEFINITION: $Z(m_1,\ldots,m_J,X,Y,B) = (Z_1(B), Z_2(B), \ldots, Z_J(B))$, where $Z_j(B) := z_j(B)$; this will frequently be denoted by $Z(B)$ or $Z^N(B)$.

DEFINITION: $\Sigma(m_1,\ldots,m_J,X,Y,B) = \operatorname{diag}(Z_1(B)I_1, Z_2(B)I_2, \ldots, Z_J(B)I_J)$, where $I_j$ is the $m_j \times m_j$ identity matrix. $\Sigma(m_1,\ldots,m_J,X,Y,B)$ will also be denoted by $\Sigma(B)$ or $\Sigma^N(B)$.

In the following section, we state and prove our results concerning the iterative estimator. The iterative estimator is constructed by choosing a limit point from the sequence generated by the iterative process. In the following sequence of propositions and theorems, it is proved that the iterative process generates sequences with limit points, that as the sample size increases the set of limit points degenerates to a singleton almost surely, and that the resulting estimator is strongly consistent.

PROPOSITION 3.1. For an observed data matrix $X$ and dependent vector $Y$ satisfying Assumptions I - IV, let $(\sigma_{1,1}^2, \ldots, \sigma_{1,J}^2) \in \mathbb{R}_+^J$ and:
1) Let $\Sigma_1 = \operatorname{diag}(\sigma_{1,1}^2 I_1, \sigma_{1,2}^2 I_2, \ldots, \sigma_{1,J}^2 I_J)$.
2) Let $B_1 = (X^T\Sigma_1^{-1}X)^{-1}X^T\Sigma_1^{-1}Y$.
3) Let $\Sigma_{n+1} = \Sigma(B_n)$.
4) Let $B_{n+1} = (X^T\Sigma_{n+1}^{-1}X)^{-1}X^T\Sigma_{n+1}^{-1}Y$.
Then:
A) For each $n \geq 1$, $B_n$ exists;
B) The sequence $\{B_n\}$ has at least one limit point;
C) If $B^*$ is a limit point of the sequence $\{B_n\}$, then $B^* = (X^T\Sigma(B^*)^{-1}X)^{-1}X^T\Sigma(B^*)^{-1}Y$.

If the $\varepsilon_{ij}$ are normally distributed, then from part C of Proposition 3.1 it is easy to see that the tuple $(B^*, Z_1(B^*), \ldots, Z_J(B^*))$ satisfies the first order conditions for maximizing the likelihood function. To see this, we need only observe that part C of Proposition 3.1 implies that $X^T\Sigma(B^*)^{-1}Y - X^T\Sigma(B^*)^{-1}XB^* = 0$ and that (if we consider $dh/dB$ to be a column vector)

$$\frac{dh}{dB} = k\,h(B)\,[X^T\Sigma(B)^{-1}Y - X^T\Sigma(B)^{-1}XB]$$

($k$ is a constant independent of $B$). Therefore, in the case that the $\varepsilon_{ij}$ are normally distributed, we have that any estimator $\Theta(X,Y)$ with the property that there exists $(\sigma_{1,1}^2, \ldots, \sigma_{1,J}^2) \in \mathbb{R}_+^J$ such that $\Theta$ is a limit point of $B_n(\sigma_{1,1}^2, \ldots, \sigma_{1,J}^2, m_1, \ldots, m_J, X, Y)$ will be consistent.
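The first order condition can be written out explicitly. The following is a sketch that assumes $h$ is the concentrated normal likelihood $h(B) = (2\pi e)^{-N/2}\prod_j Z_j(B)^{-m_j/2}$, which is the form given to $h$ in Proposition 3.6; under that assumption the constant works out to $k = 1$.

```latex
\begin{aligned}
\log h(B) &= -\tfrac{N}{2}\log(2\pi e) - \sum_{j=1}^{J}\tfrac{m_j}{2}\log Z_j(B),
\qquad Z_j(B) = \tfrac{1}{m_j}(Y_j - X_jB)^T(Y_j - X_jB),\\[4pt]
\frac{\partial Z_j(B)}{\partial B} &= -\tfrac{2}{m_j}\,X_j^T(Y_j - X_jB),\\[4pt]
\frac{\partial \log h(B)}{\partial B}
  &= \sum_{j=1}^{J}\frac{X_j^T(Y_j - X_jB)}{Z_j(B)}
   = X^T\Sigma(B)^{-1}Y - X^T\Sigma(B)^{-1}XB .
\end{aligned}
```

Since $dh/dB = h(B)\,\partial\log h(B)/\partial B$ and $h(B) > 0$, the derivative vanishes exactly when $B = (X^T\Sigma(B)^{-1}X)^{-1}X^T\Sigma(B)^{-1}Y$, which is the condition in part C of Proposition 3.1.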
The next result shows that the normality assumption is superfluous. PROPOSITION 3.2. g(B) g:IRk be defined by = (XTE(B)'X)~' XT (B)'Y [=(l XTE B)X)- X7Z(B) of Let g, 1 Y)]. and let YO > 0, F Let d = Sup{|| B-B 0 ||: d < y Let almost N t o. COROLLARY 3.3. Let property that for each ail ,... lalj , such that for each pair 2 point of B c Fl. then under assumptions I-V, surely as number be the set of fixed points {B n a 1,1~ Assumptions I-V, '''l, 0 o N, be an estimator with the there exists positive (perhaps depending upon (X,Y), 2 O(X,Y) ,XY)}. N) is a limit Then under is strongly consistent. 96 PROPOSITION 3.4. the property for each bers 2 2 ,...a a1 Let 0 N, there exists positive num- such that for each pair is a limit point of G(X,Y) be an estimator with (X,Y), J2 {Bn (al 1 2 ''''' X,Y)}. Bn Then under Assumptions I-V, the sequence 2 2 is convergent almost X, Y) 1,1 '''''1,, N + c. surely as The next result is a minor improving of the last proposition. It will however pave the way for showing that almost surely as N is independent of the (a1 1 ... 2, .. PROPOSITION 3. 5. Let g:IR k by g(B) K(N):= = XTE(B) X) gets large, our estimator 1 T (B) , 2) k Y). g(B - g(B2 )11 71B1 B2 11 Sup - B2 | E Rk B1 - B 0 l <1 11 B2 - B0 l <1 0 < Then K(N) 11B < 1/2 1 - B2 1 ' almost surely as N t- selected. be defined Let 97 g(B) = implies be defined as before by Il B- g(B) = JIB, - B0 < 1 and points we have l 11 g(B 1 ) B1 - B2 11 and thus 1[B PROPOSITION 3. 6. as in Proposition 3.5. - B2 1 = are fixed B2 and B1 Since, if both - < 1/2 has a unique g then it follows that fized point. g(B-2 ) 1 < 1/2 0. IRk Let g:IR k Let F = {B E mkg(B) = be defined [(Y.-XiB) (Y.-X.B) mj Let h(B) (2-re) = 1 = B Suppose j[g(Bl) - g(B 2) implies that < 1 B 1 - B 2 1| and that B0 l <1 |J B1 - B 2 jI, (B) ~ Y). X (XT (B) B 2 - B 0j1 H| IRk g:R k Let p. -1/2 m then: 1) F 2) There exists a unique is a singleton almost surely as almost surely as 3) R k F Sup BEF that maximizes h N + o. 
$B = B^*$, where $B$ is the unique element of $F$, almost surely as $N \to \infty$.

Proof. 1) Proposition 3.2 gives us that $\sup_{B \in F} \|B - B_0\| < 1$ almost surely as $N \to \infty$. Proposition 3.5 yields that

$$\sup_{\|B_1 - B_0\| < 1,\ \|B_2 - B_0\| < 1} \frac{\|g(B_1) - g(B_2)\|}{\|B_1 - B_2\|} < 1/2$$

almost surely as $N \to \infty$. 2) Proposition 3.1 shows that $F$ cannot be empty; thus we have: 3) $F$ is a singleton almost surely as $N \to \infty$. Proposition 3.1 gives us that there exists $B \in \mathbb{R}^k$ that maximizes $h$ and furthermore that any such $B$ is a fixed point of $g$. The rest follows immediately.

COROLLARY 3.7. Let $\Theta(X,Y)$ be an estimator with the property that for each $N$ there exist positive numbers $\sigma_{1,1}^2(N), \ldots, \sigma_{1,J}^2(N)$ such that $\Theta(X,Y)$ is a limit point of $\{B_n(\sigma_{1,1}^2(N), \ldots, \sigma_{1,J}^2(N), X, Y)\}$; then:

1) $\Theta(X,Y)$ maximizes the function

$$h(B) = (2\pi e)^{-N/2}\prod_{j=1}^{J}\Big[\frac{(Y_j - X_jB)^T(Y_j - X_jB)}{m_j}\Big]^{-m_j/2}$$

almost surely as $N \to \infty$.

2) $\Theta(X,Y)$ is independent of the choices for $\sigma_{1,1}^2(N), \ldots, \sigma_{1,J}^2(N)$ almost surely as $N \to \infty$.

Proof. For any choice of $\sigma_{1,1}^2, \ldots, \sigma_{1,J}^2$, the limit points of $B_n(\sigma_{1,1}^2, \ldots, \sigma_{1,J}^2, X, Y)$ are fixed points of $g$. Part 1 of the corollary now follows. If $B$ is a fixed point of $g$, and if $\sigma_{1,j}^2 = Z_j(B)$, then we have $B_n(\sigma_{1,1}^2, \ldots, \sigma_{1,J}^2, X, Y) = B$ for all $n$. If $F$ is the set of fixed points of $g$, where $F = \{B^*\}$, then it follows that for any choice of $\sigma_{1,1}^2, \ldots, \sigma_{1,J}^2$ the sequence $\{B_n(\sigma_{1,1}^2, \ldots, \sigma_{1,J}^2, X, Y)\}$ must converge to $B^*$. Therefore whenever $F$ is a singleton, $\Theta$ is independent of the choice of $\sigma_{1,1}^2, \ldots, \sigma_{1,J}^2$.

One might conjecture that the set $F$ of fixed points of $g$ is always a singleton. The following example shows that this is not the case.

EXAMPLE: Let

          Y_t1   X_t1   Y_t2   X_t2
  t = 1    1      0      1      0
  t = 2    2      1      2     -1

Let $B = 0$; then $Z_1(0) = Z_2(0) = 5/2$, so $g(0) = (5/4)\cdot 0 = 0$. Also

$$h(0) = (2\pi e)^{-2}(5/2)^{-2}, \qquad h(2) = (2\pi e)^{-2}(17/4)^{-1},$$

so $h(2) > h(0)$. Thus 0 does not maximize $h$, and hence $g$ has at least one other fixed point. Maximizing $h$, we find that $B = \pm\sqrt{3}$ are also fixed points of $g$.
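The example can be checked numerically. The sketch below hard-codes the four observations, evaluates $Z_j(B)$, $g(B)$ and $h(B)$ for scalar $B$, and measures how far each candidate is from being a fixed point. The function and variable names are illustrative and are not taken from the thesis.

```python
import math

# Data from the example: each group j is a list of (Y_tj, X_tj), t = 1, 2.
GROUPS = [[(1.0, 0.0), (2.0, 1.0)],
          [(1.0, 0.0), (2.0, -1.0)]]


def z(group, b):
    """Z_j(B): mean squared residual of one group at coefficient b."""
    return sum((y - x * b) ** 2 for y, x in group) / len(group)


def g(b):
    """One weighted least squares step using weights 1 / Z_j(b)."""
    num = sum(sum(x * y for y, x in grp) / z(grp, b) for grp in GROUPS)
    den = sum(sum(x * x for y, x in grp) / z(grp, b) for grp in GROUPS)
    return num / den


def h(b):
    """Concentrated likelihood (2*pi*e)^(-N/2) * prod_j Z_j(b)^(-m_j/2)."""
    n = sum(len(grp) for grp in GROUPS)
    value = (2 * math.pi * math.e) ** (-n / 2)
    for grp in GROUPS:
        value *= z(grp, b) ** (-len(grp) / 2)
    return value


root = math.sqrt(3.0)
# Distance |g(b) - b| at the three candidate fixed points.
fixed_point_errors = [abs(g(b) - b) for b in (0.0, root, -root)]
```

For this data set $g$ reduces to $g(B) = 8B/(5 + B^2)$, so the fixed points are $0$ and $\pm\sqrt{3}$, and all three entries of `fixed_point_errors` are zero up to rounding.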
In this section we analyze the asymptotic distriSince butional properties of our proposed estimator. this estimator equals the maximum likelihood estimator (when the are normally distributed) almost c. surely as the sample size grows large, it is not surprising that it has optimal asymptotic properties. Before proceeding to this analysis we need to introduce the following assumptions-. Assumption VI: 1 N N j=l lim T X. X and -N ±2 - appropriate diagonal matrix. a EN 1 Nl 0 0 0 0 0 -1 m.-(N) 3N The lim R X - N X Assumption VII: 1 XTN XN) j, For each 0 a INj exists exists where is the 101 where I is the m.(N) x M (N) identity matrix. Assumption VII (Replaces Assumption V). N, | j 2 {s..[ J IJ =j 1,2,...,J For all i = 1,2,... ,m.(n)} is a collection of independent and identically distributed random variables. LEMMA 3. 8. A) all B) C) lim a Let m. (N) aN j=1 N 2 a. 2 3 exists and max a. 2> a N 3 j = a N trace 1 lm N-*m -1 X (aN N) -1 min j a. 3 2 N all N = N aN T Then: exists and is positive X definite. PROPOSITION 3.9. with the property that g(B) = (XT Let Q(X,Y) g(O(X,Y) X)-(XTZ(B)~ Y) (B) be an estimator O(X,Y) = where then under the hypothesis above: A. 0 is asymptotically equivalent to the weighted least squares estimator with known variances in the sense that if W is the latter, p lim N(O-W) = 0. N + co B. 0 is asymptotically normally distributed with mean vector (XT -1 -1 B0 1 and variance covariance matrix XT. E-1X -1 102 C. (. X surely as -I 1 - X) ( XT()lX)l -+ 0 almost N + *. It is impossible numerically to identify the limit of a sequence. In paractice, one calls a term of a computed sequence a limit if it differs from the previous term by an amount less in norm than some preassigned tolerance level. 
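The stopping rule just described can be sketched as follows for the iterative weighted least squares procedure of Section 3. This is a minimal sketch restricted, as a simplifying assumption, to a single regressor so that each step is a ratio of sums. The tolerance, the iteration cap, the starting variances, and the function name are illustrative choices and are not part of the thesis.

```python
def iterate_wls(groups, start_variances, tol=1e-10, max_iter=1000):
    """Iterate B_n -> B_{n+1} until successive terms differ by less than tol.

    groups: list of groups, each a list of (y, x) pairs (scalar regressor).
    start_variances: positive starting values, one per group (Step 1).
    Returns (estimate, updates_used, converged_flag).
    """
    def wls_step(variances):
        # Weighted least squares with one weight 1 / variance per group.
        num = sum(sum(x * y for y, x in grp) / v
                  for grp, v in zip(groups, variances))
        den = sum(sum(x * x for y, x in grp) / v
                  for grp, v in zip(groups, variances))
        return num / den

    b = wls_step(start_variances)
    for n in range(1, max_iter + 1):
        # Re-estimate each group variance from the current residuals.
        variances = [sum((y - x * b) ** 2 for y, x in grp) / len(grp)
                     for grp in groups]
        b_next = wls_step(variances)
        if abs(b_next - b) < tol:
            return b_next, n, True
        b = b_next
    return b, max_iter, False


# The two-group data from the example in Section 3.
data = [[(1.0, 0.0), (2.0, 1.0)],
        [(1.0, 0.0), (2.0, -1.0)]]
estimate, steps, converged = iterate_wls(data, [1.0, 4.0])
estimate_equal_start, _, _ = iterate_wls(data, [1.0, 1.0])
```

With starting variances (1, 4) the routine stops near $\sqrt{3}$, while with equal starting variances it stops at 0. Both are fixed points of $g$ for this four observation sample, which illustrates why a tolerance based rule identifies a limit of the iteration but cannot by itself say which fixed point has been reached.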
The above theorems state that if this procedure does pick a limit of the iterative procedure, then the statistics computed in this last stage have their usual expected asymptotic distribution; that is, one can do asymptotic hypothesis tests and construct asymptotic confidence intervals using this output.

4. Linear Variance Model.

Block scalar covariances arise when the variances are structurally related to discrete valued variables. In the case that the variances are structurally related to at least one continuous valued variable, the likely candidate for the variance structure is that it also follows a linear model. Let $\sigma^2$ be the vector whose $i$th component is given by $\sigma_i^2 = E(\varepsilon_i^2)$. The linear variance model for heteroskedasticity assumes that the variance covariance matrix $\Omega$ is diagonal with diagonal elements $\sigma_i^2$ and furthermore that $\sigma^2 = Z\Gamma$, where $Z$ is an observed matrix and $\Gamma$ is unknown. The matrices $X$ and $Z$ may share columns or may be unrelated. The usual problems are to:

1) Estimate $\Gamma$ sufficiently well that the estimate of $\Gamma$ yields an estimator of $\sigma^2$ that can be used in estimating $\beta$, performing hypothesis tests on $\beta$, and constructing confidence intervals on $\beta$.
2) Test hypotheses on $\Gamma$.
3) Construct confidence intervals for $\Gamma$.

Glejser [ ] and Park [34] have suggested two similar procedures for estimating and performing hypothesis tests on $\Gamma$. Both of these procedures are supported by heuristic arguments, but both are known to lead to inconsistent hypothesis tests. In this section a new procedure which uses White's theorem is proposed. It has the advantages of being easy to construct and understand, and it is shown to yield consistent test statistics for testing hypotheses on $\Gamma$. This procedure yields estimates for $\Gamma$ and thus for $\sigma^2$ that can be used in a weighted least squares procedure to estimate $\beta$.
If $\beta$ is estimated by this multiple stage weighted least squares procedure, the final stage estimate for $\beta$ together with the usual generated statistics have the expected known asymptotic distributions.

In this section we give an heuristic description of Glejser's procedure and describe our analysis. In the following section we present and prove our two main theorems. We end the section with some comments about the validity of our assumptions and some comments for extending this work.

Let the model be given by:

(1) $Y_i = X_i \beta_0 + \varepsilon_i, \quad i = 1, 2, \ldots, n$

where:

(2) $E(\varepsilon_i) = 0, \qquad E(\varepsilon_i \varepsilon_j) = \sigma_i^2$ if $i = j$ and $0$ if $i \neq j$.

We also assume that:

(3) $\sigma_i^2 = Z_i \Gamma_0, \quad i = 1, \ldots, n.$

$Y_i$ is an observable real valued variable, and $X_i$ and $Z_i$ are vectors of real valued variables; $X_i$ has dimensions $1 \times k$ and $Z_i$ has dimensions $1 \times m$. It is possible that $X_i$ and $Z_i$ share common components. $\beta_0 \in \mathbb{R}^k$ and $\Gamma_0 \in \mathbb{R}^m$ are unknown parameter vectors. The $\varepsilon_i$ $(i = 1, 2, \ldots, n)$ are real valued unobservable random variables, and the $\sigma_i^2$ are positive real numbers, also unobservable. We assume that the vectors $(\varepsilon_i, X_i, Z_i)$, $i = 1, \ldots, n$, are independently distributed.

Following Glejser's suggestion we let $\hat{\beta}_n = (X^T X)^{-1} X^T Y$, the OLS estimate of $\beta_0$, and then we let $\hat{\varepsilon}_{in} = Y_i - X_i \hat{\beta}_n$, where $\hat{\varepsilon}_{in}$ is the ordinary least squares residual. Now we estimate $\Gamma_0$ by $\hat{\Gamma}_n = (Z^T Z)^{-1} Z^T \hat{\varepsilon}_n^2$, where $\hat{\varepsilon}_n^2$ is the column vector whose $i$th component is $(\hat{\varepsilon}_{in})^2$. Glejser then suggests we perform our hypothesis tests on $\Gamma_0$ by using $\hat{\Gamma}_n$. If our observed $\hat{\Gamma}_n$ "supports" the model $\sigma^2 = Z\Gamma_0$, we then use a weighted least squares procedure to reestimate $\beta_0$, using $Z_i \hat{\Gamma}_n$ as our estimate of the variance $\sigma_i^2$.

In Glejser's paper, he proposes the model
$$\varepsilon_j = v_j P_g(f(Z_j)) = v_j \{ m_0 + m_1 f(Z_j) + \cdots + m_g [f(Z_j)]^g \},$$
where the $v_j$ are independent random variables with $E(v_j) = 0$ and $E(v_j^2) = \sigma^2$, the function $f$ is known, $m$ is unknown, and each term is assumed positive. Thus he has $\sigma_j^2 = \sigma^2 [P_g(f(Z_j))]^2$, and by taking absolute values and then expectations gets $E|\varepsilon_j| = E(|v_j|)\, P_g(f(Z_j))$.
He then suggests estimating $m$ by regressing $|\hat{\varepsilon}_{jn}|$ on powers of $f(Z_j)$. R. E. Park [34] suggests a similar procedure where he assumes $\sigma_j^2 = \sigma^2 Z_j^{\gamma} e^{w_j}$; taking logarithms he gets $\ln \sigma_j^2 = \ln \sigma^2 + \gamma \ln Z_j + w_j$. He then uses $\hat{\varepsilon}_{jn}^2$ to replace $\sigma_j^2$ and performs a regression on $\ln \hat{\varepsilon}_{jn}^2$ to estimate $\sigma^2$ and $\gamma$.

We find that for our analysis it is more convenient to use the model $\varepsilon_j = v_j (Z_j \Gamma_0)^{1/2}$, with $E(v_j) = 0$ and $E(v_j^2) = 1$. We then get (3), that is, $\sigma_j^2 = Z_j \Gamma_0$.

The apparent problem with all of these procedures is, as Glejser observes, that the estimated coefficients are biased. Glejser optimistically states that we should ignore the bias effect in the hope that it will generally be unimportant compared to other contributing terms.

Returning to our model for the variances, we observe that:

(4) $\varepsilon_j^2 = \sigma_j^2 + (\varepsilon_j^2 - \sigma_j^2) = Z_j \Gamma_0 + (\varepsilon_j^2 - \sigma_j^2)$

and that $E(\varepsilon_j^2 - \sigma_j^2) = 0$. Therefore, if we could only observe $\varepsilon_j^2$, the OLS regression of $\varepsilon_j^2$ on $Z_j$ would yield a consistent estimator for $\Gamma_0$. Of course, the error term $\varepsilon_j^2 - \sigma_j^2$, while having zero expectation, need not have constant variance. So it would appear that we have again returned to the problem of hypothesis testing in the presence of heteroskedasticity. Now, however, we are in the position of using White's procedure to generate a consistent estimator of the variance covariance matrix and are able to correctly perform asymptotic $\chi^2$ tests on hypotheses concerning restrictions on the parameter vector $\Gamma_0$. Unfortunately we cannot observe $\varepsilon_j^2$. It is the substance of this paper that we can replace $\varepsilon_j^2$ by $\hat{\varepsilon}_{jn}^2$ and that in so doing we will get estimators and statistics that are asymptotically equivalent to those obtained when we used $\varepsilon_j^2$.

In the next section we formally state our principal result; proofs are given in Chapter IV. Before doing this, however, it is necessary to introduce additional notation.
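The procedure outlined above can be sketched numerically. The Python fragment below is a simplified illustration and not the computation of this dissertation: it assumes a single regressor $x_i$ with no intercept, takes $Z_i = (1, z_i)$ with $z_i = x_i$, and simulates normal disturbances with $\sigma_i^2 = 0.5 + z_i$. The names `inv2`, `matmul2`, `variance_regression` and `simulate` are hypothetical. The covariance of $\hat{\Gamma}_n$ is the White sandwich form $(Z^T Z)^{-1} (\sum_i \hat{\mu}_{in}^2 Z_i^T Z_i) (Z^T Z)^{-1}$ with $\hat{\mu}_{in} = \hat{\varepsilon}_{in}^2 - Z_i \hat{\Gamma}_n$, and the test shown is the Wald statistic for the single restriction that the slope of the variance function is zero.

```python
import random


def inv2(a):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def variance_regression(x, y, z):
    """OLS of y on x, then OLS of the squared residuals on (1, z).

    Returns the OLS slope, the variance coefficients gamma, their
    heteroskedasticity-consistent covariance, and the Wald statistic
    for the hypothesis that the coefficient on z is zero.
    """
    beta = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    e2 = [(yi - xi * beta) ** 2 for xi, yi in zip(x, y)]
    rows = [(1.0, zi) for zi in z]
    ztz = [[sum(r[a] * r[b] for r in rows) for b in range(2)] for a in range(2)]
    zte = [sum(r[a] * e for r, e in zip(rows, e2)) for a in range(2)]
    ztz_inv = inv2(ztz)
    gamma = [sum(ztz_inv[a][b] * zte[b] for b in range(2)) for a in range(2)]
    # Residuals of the variance regression.
    mu = [e - (gamma[0] * r[0] + gamma[1] * r[1]) for r, e in zip(rows, e2)]
    meat = [[sum(m * m * r[a] * r[b] for r, m in zip(rows, mu)) for b in range(2)]
            for a in range(2)]
    cov = matmul2(matmul2(ztz_inv, meat), ztz_inv)
    wald = gamma[1] ** 2 / cov[1][1]
    return beta, gamma, cov, wald


def simulate(n, seed):
    """Simulated data with y = 2 x + e and Var(e | x) = 0.5 + x."""
    rng = random.Random(seed)
    x = [rng.uniform(1.0, 5.0) for _ in range(n)]
    y = [2.0 * xi + (0.5 + xi) ** 0.5 * rng.gauss(0.0, 1.0) for xi in x]
    return x, y


x, y = simulate(n=4000, seed=1)
beta_hat, gamma_hat, gamma_cov, wald_stat = variance_regression(x, y, x)
```

The Wald statistic is compared with a $\chi^2$ critical value with one degree of freedom (3.84 at the 5 percent level). The point of the sandwich form is that no assumption of constant variance is imposed on the error term $\varepsilon_i^2 - \sigma_i^2$ of the variance regression.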
We also state without proof some elementary propositions on convergence in probability and almost sure convergence. In the next section we shall make frequent reference to convergence in norms. Thus for a real valued variable $x$, $\|x\|$ is the absolute value; if $x$ is a $1 \times k$ vector or a $k \times 1$ vector in $\mathbb{R}^k$, the norm is the standard Euclidean norm. Since $(\mathbb{R}^k)^* = \mathbb{R}^k$, this norm is the same as the dual norm. For $X$ a matrix in $\mathbb{R}^{n \times k}$ we use the norm in $L(\mathbb{R}^k, \mathbb{R}^n)$, that is
$$\|X\| = \sup_{y \in \mathbb{R}^k,\ \|y\| = 1} \|Xy\|.$$
Since on a finite dimensional Banach space all norms induce the same topology, if $X_n$ is a sequence of matrices in $\mathbb{R}^{k \times j}$, then $\|X_n - X_0\| \to 0$ in $L(\mathbb{R}^k, \mathbb{R}^j)$ if and only if $|X_n(i,j) - X_0(i,j)| \to 0$ for each $(i,j)$. That is, we get convergence in the operator norm of $L(\mathbb{R}^k, \mathbb{R}^j)$ if and only if we have convergence for each matrix entry, and hence if and only if we have convergence in the Euclidean (Hilbert-Schmidt) norm.

We are now in a position to list our assumptions and to state our results. For the convenience of the reader, we have followed much of the notation of White [59]. We have also borrowed liberally from him on the wording of our assumptions.

A1) The model is known to be
$$Y_i = X_i \beta_0 + \varepsilon_i, \qquad E(\varepsilon_i) = 0, \qquad E(\varepsilon_i^2) = \sigma_i^2 = Z_i \Gamma_0, \qquad i = 1, 2, \ldots, n,$$
where $X_i$ is a $1 \times k$ vector of random variables and $\beta_0$ is a $k \times 1$ vector of real numbers. $Y_i$ and $\varepsilon_i$ are real valued random variables. $X_i$ and $Y_i$ are observable, $\varepsilon_i$ is unobservable, and $\beta_0$ is to be estimated or hypotheses concerning $\beta_0$ are to be tested. $Z_i$ is a $1 \times m$ vector of real valued random variables which may contain some or all of the variables in the vector $X_i$. $\Gamma_0$ is an $m \times 1$ unknown vector of real numbers which is to be estimated or hypotheses concerning $\Gamma_0$ are to be tested. Let $W_i$ be the vector of length $p$ of random variables whose first entry is the scalar 1 and whose other entries are exactly those random variables that appear in $X_i$ or $Z_i$. We let $\mu_i$ denote $\varepsilon_i^2 - \sigma_i^2$. We assume that $E(W_{ij} W_{ik} \varepsilon_i) = 0$ for $1 \le j, k \le p$ and $E(W_i^T(\varepsilon_i^2 - \sigma_i^2)) = 0$.
The vectors .2)) 1i = 0. (W.,e.) 1 1 are assumed to be a sequence of independent though not necessarily identically distributed random vectors. A2) I) There exists such that for all 0 < 6 < 1 A < co i: a) E(Ier W. W.is W.W. it lv b) E(E.W.kW.W. WitW. c) Wikir isW it Wiiv E(1W ij ikirs 1 and lr ) < A 1 < r.,s,t,v < p p )+ < A l<k,r,s,t,v < p 11+6 )< _ 1<5 ,k,r,s,t.,v <p 110 d) E(1p. e) E(JE II) Let i 1+6) < A - 3W W Wi 'urisit1 1+6 )< < ~ A Ma =n 1E(XTX.i) nn~ T n M b -1I 1= < p and let E(Z.TZ.) ii We assume that there exists NO < O 0 < X and n > No, the minimum eigenvalues of Ma n b exceeds X and the minimum eigenValue of M exceeds A; n such that for (Note by the first part of A2) this is equivalent to the property that for n sufficiently large det Ma and n det Mb is bounded away from zero. Also, observe n that we can choose A3) 6 and A Let Va = n1 En E(x 2X) n i=1ii1 let Vb n = n1 Eni=l E(y i ZZ.) i We assume that there exists that for so that they are equal. n > N , 0' and N0 < O and minimum eigenvalues of A > 0 such Va exceed A and the minimum eigenvalue of V exceeds A. n n (There is no loss in generality in assuming that NOA is A2 and NOA in A3 are the same). In the presence of A2, A3 is the equivalent of the assumption that for n 111 sufficiently large minimum (det Va, det V ) .R %n is bounded away from and above zero. The first theorem is a restatement of a result ] and its proof can be found therein. found in White [ Before stating Theorem 3. we introduce additional notations. (XTX) Let 6 Let if (XTX) is nonsingular 0 if (XTX) is singular T-l T 2 (ZT Z) ZT if (Z Z) is nonsingular if (ZTZ) lXTY = an 2 is the (Ci ) T n 0 C2 . nx column vector whose is singular i th entry is . Let i= - E . in Xin 1 2 Let pin = - Zan E Let ^a =n n Let Vb n n Let Ra i=1 in n n = i= nin be a qxk full row rank and let . T ^2 y2 ' i i T i i matrix of real numbers with ra be a qxl vector of real numbers. 
Let R be a row rank and let qxm r matrix of real numbers of full be a qxl vector of real numbers. 112 THEOREM 4.1. (White) Under Assumptions Al, A2, and A3, we have the following: i) ii) 0 n an - 11)/ni[ a n XTX -1 n a n -12 n n )1 E. 2 ^b Z - ra 1(a, a an A n r T- are not observable, statistics associated with N(0,I k) R aSO H0 X TX)aT 0) , [R ( ) under hypothesis RaO R 1] Ho: (R-r n the V n n( n T Since X n under the hypothesis ra T[a n(ra -vi) r) n1 a.e. XX - 1 ^ a n Vn ... v) r0 (a.e,. almost everywhere r r A an 2 q q 2 and the are not computable. It is a principle result of this paper, that we can ^2 E replace by Lin and obtain asymptotically equivalent results. This notion is more carefully stated and then proved in the next theorem. 113 Before stating this theorem it is once again necessary to introduce additional notations. 2 Let entry column vector whose nxl be the n ith 2 is i Let r T (Z Z) = n -1 T^ 2 Z T if Z Z is nonsingu lar. otherwise. 0 A Let W. in = A 2- - Sc i n n n 2 i=l Theorem 4.2. and z.r . in T in i i Under assumption Al, A2, and A3, the following hold: i) n + 0 a.s. Tc -1 iii) T _1 vn [n ii) ^ 0 2(n under the hypothesis A m N(0,Im) H0: RI'0 = r (where R,r are as in Theorem 3. . n(Rn -r) (Note: T [R( ZTZ )-1 c n ZTZ -1 T -l Zn) R] 2 (Rnr),Xq A part of the statement of this theorem is that matrices whose inverses must be taken well as n + w be nonsingular almost everywhere]. - 114 Thus we have shown that we have a valid asymptotic test for testing linear restrictions on the parameters in the variance model. Our next and last result is to show that under additional assumptions, we can use our estimates from the variance model to reestimate the original model and obtain a estimator that is asymptotically equivalent to weighted least squares with variances known. A4) There exists x > 0 such that for all i a.2 > 1 AS) For all 2 i, E(.I|W. ,..., 2 i E(EsIW .,... ,W.) = W. ,... 0,E(s = ) W 1P zi 0r . 
A6) There exists $M < \infty$ such that for all $i$, $\|Z_i\| < M$.

Theorem 4.3. Let $\Omega_n$ denote the $n \times n$ diagonal matrix whose $(i,i)$ entry is $\sigma_i^2$, and let $\hat{\Omega}_n$ denote the $n \times n$ diagonal matrix whose $(i,i)$ entry is $Z_i \hat{\Gamma}_n$. Let $B_n^0$ denote the Aitken estimator given by
$$B_n^0 = (X^T \Omega_n^{-1} X)^{-1} X^T \Omega_n^{-1} Y \text{ if } X^T \Omega_n^{-1} X \text{ is nonsingular}, \qquad B_n^0 = 0 \text{ if } X^T \Omega_n^{-1} X \text{ is singular}.$$
Let $\hat{B}_n$ be the weighted least squares estimator given by
$$\hat{B}_n = (X^T \hat{\Omega}_n^{-1} X)^{-1} X^T \hat{\Omega}_n^{-1} Y \text{ if } X^T \hat{\Omega}_n^{-1} X \text{ is nonsingular}, \qquad \hat{B}_n = 0 \text{ if } X^T \hat{\Omega}_n^{-1} X \text{ is singular}.$$
Under Assumptions A1-A6,
i) $\|\hat{B}_n - \beta_0\| \to 0$ a.s.
ii) $\operatorname{plim} \sqrt{n}\,(\hat{B}_n - B_n^0) = 0$ and $\sqrt{n} \left( n^{-1} X^T \hat{\Omega}_n^{-1} X \right)^{1/2} (\hat{B}_n - \beta_0) \overset{A}{\sim} N(0, I_k)$.
iii) If $R$ is a $q \times k$ matrix of real numbers with full row rank and $r$ is a $q \times 1$ vector of real numbers, then under $H_0\colon R\beta_0 = r$,
$$n (R\hat{B}_n - r)^T \left[ R \left( n^{-1} X^T \hat{\Omega}_n^{-1} X \right)^{-1} R^T \right]^{-1} (R\hat{B}_n - r) \overset{A}{\sim} \chi_q^2.$$

CHAPTER IV

MATHEMATICAL PROOFS

1. Introduction.

This chapter contains the complete statements and proofs of the theorems developed in Chapters II and III. Because of its mathematical nature, the chapter is designed to stand apart from the rest of the dissertation and may be passed over by those with little mathematical inclination without losing the content of the rest of the dissertation. The proofs in this chapter, especially those of the theorems appearing in Chapter III, may be of interest not only because they yield further insight into the contents of the theorems, but also because they are examples of the application of elementary functional analytic and Banach algebraic techniques to statistical analysis.

The norms used in showing convergence in the lemmas and theorems in Chapter III are the operator norms. Since any two Hausdorff topological vector spaces over the same scalar field and of the same finite dimension are isomorphic as topological vector spaces, convergence of a sequence in one norm implies convergence in all norms.

Definitions and notation, where not explicitly restated in this chapter, are taken from Chapters II or III where they are first introduced.

2. Market Search.

A1.
For each in y F, p in Rn, there is a choice in > py. Yfpyf(p) each of For each f, p IRN in X (p) p -* fF Rn and for each f p(f) (Xh(p) and in yf(p) and for each h yf(p) yf(p) H,. can be is continuous. p(f) > 0 with in f such that for all Furthermore choosen such that the max A2. p > 0 for there is a choice such that Xh) < - [ d(h,k) p(k) yk(p(k)) kcF Uh(X (p)) > sup{Uh(x): for all x c IRn where Z p(f)(x-n) < d(hk) p(k) yk(p(k)). Furthermore, f kEFf xh(p) can be choosen such that the map p -+ xh(p) is continuous for each h in H and f in F. Let F. C be an arbitrary function from For each firm p(f) > 0 f for each Zf(P) = and each f, p in H G0 ]Rn f (X(p)-h p > 0 Ei= p(i) = 1}. with let - Yf(p). S Let be n h:c(h)=f the unit Simplex in into Rn, For that is p 0 f F Sn: = Sn, p R n we'have: 118 Ef PEf) Zf(P) =Ef p(f) (Xf ~ n) h:c(h)=f Ef p(f) If Yf(p). = f, c(h) p(f) = p(c(h)), then so by A2, h:c(h) =f p(f) (Xh(P)-Xn d(h,k) p(k) Yk(p(k)) :c(h)=f + I r kEF d(h,k) p(k) Yk(p(k)) p(k) Yk (P(x)) hEH kcF kEF fEF p (f) Yf(P(f)). f EF A3. For no Zf(p) (i) A4. > 0 For each p in Q Sn is it the case that f implies that p(f)(i) = 1. h in Uh : R n' H, R is a continu- ous function. THEOREM 2.1. Under assumptions Al, A2, A 3 and A4, there exists a search equilibrium with Proof. Let K = (D hEH,fsF S (G S , C* = C. then K n feF is a compact convex subset of the finite Cartesian product of copies of JR . We identify an element 119 of K and by the pair has in for each Sn (p*,q*) p(h,f) h in is H function We seek a continuous F. in is q(j) where (p,q), as a fixed point only if in and S n f,j ip:K+K th at (p*,q*) is a price profile for a competitive search equilibrium. Let 1 n +R n 4:Sn x 2 n) where be given by xRn p3:S IR is defined as follows: 1p1 ) (a 1\J) $ (pi,p' Y ''' - 1n, a ,a2,...,a n) J+$ Here and a) =(-$VO), (p,a) = (P ,...,pn, aVb = max(a,b). -Pj+1)(a +\10) a1 ,...,an), The function $p = j (pa), has the following properties: a) $ is continuous. 
b) For.all c) $(fa) > 0 with d) pi (p,a) in Sn x Rn , (p,a) > 0. if and only if there is an index 1 and a. > 0. There exists at most one index j (p,a) / 0. j with i 120 X be the function from Let defined by = (aVO)/(1 , (a) into IR IR Then >, is + (aVO)). [0,1) and only if if 0 and p c a < 0. Sn, 0 For let and h f (p) X(a) in equals f H, is X(a) continuous function with the property that in the half open interval a in F be defined by hEH,f F ( Xh (p) =f $ The function $(p,q) and = ( $2 :K Uh[X X{U h[Xh(P (h,-))] X 9 -* S . (p(h, -))] . can now be defin ed as follows 2 (pq)) where p,q) (h) )1:K -+ hD Sn he-H .fcF is defined by The function fEF } 1 (p,q)(h,f) p(h,f) + q if f = c(h) hf(p)- p(hc(h))+[1 -A (fp(h, if q(f) f) f A c(h) + $(q(f),Zf(q)) 2 1 + i Clearly K * is $(q(f) Zf(q)) [i] a continuous function from and so by Brouwer's fixed point vex subsets of IRn 4 K into for compact con- has a fixed point (p*,q*). 121 Let and let C* = C. Since = p*, $ 1(p*,q*) p*(h,c(h)) = q*(c(h)). p*(h,f) = p*(h,f) = A2, < p*(h,c(h)). Since i In this case j or * = by assumption with Zf.(q*)(i) > 0 $ (g* (f), Z f(q*) ) > 0, ej(q*(f),Zf(q)) with + $ > 0 (q*(f) ,Zf (q*)) > q* (f)(j) 1 + $d (q*(f), Ef(q*)) This contradicts the fact that for = 0 q*(f)(j) / 1, q*(f) (j) point (p *) $ 2(p*,q*) Otherwise, and an so there is an index and since Xh either f. f q(f)(i) / 1. and then since In either event, for each we can find a follows directly that f / c(h), p*(p*,q*)(h,f), 0 it If U(xh(p*)) < U(X c(h) (p*)). Zf(q*) y* = yf(g*(f)), let c(h) p x* = 11, and so Zf(q*) (p*,q*) < 0. consumption allocation vector by X*(h) = y* defined by defined by equilibrium. a fixed The price profile (p*,q*), XA, is defined X*, the production allocation vector y*(f) = C = y* and the choice C* is then a competitive search 122 COROLLARY 2.2 In a search equilibrium two different firms may post different prices for identical commodities. In a simple general equilibrium model PROOF. 
such as that described in Chapter 2 of Arrow and Hahn [ ], we see that the equilibrium price is not independent of the initial endowments $\bar{x}_h$. Consider two simple general equilibrium models, identical except for the initial endowments $\bar{x}_h$ and the resulting equilibrium price vectors. These models are identified by the parameters $(H, F, \bar{x}_h, p^*)$ and $(H^1, F^1, \bar{x}_h^1, p^{1*})$, where $F = \{f\}$ and $F^1 = \{f^1\}$ are both singletons. Now consider the search model with $H \cup H^1$ householders, $F \cup F^1$ firms, and choice function $c$ defined by $c(h) = f$ if $h$ is in $H$ and $c(h) = f^1$ if $h$ is in $H^1$. The consumption vectors $x_h^*$, $x_h^{1*}$ and production vectors $y^*$, $y^{1*}$ from the simple models will also be the desired equilibrium commodities in the search model. We assign ownership of the two firms as follows:
$$d(h,f) = S(h,f) \text{ if } h \in H, f \in F \text{ or } h \in H^1, f \in F^1; \qquad d(h,f) = 0 \text{ elsewhere}.$$
Here $S(h,f)$ is the assignment of ownership in the original model. For each household $h$ in $H \cup H^1$ we have an ordering on $S^n$ defined by $p > q$ if and only if
$$\sup\{U_h(X) : X \in \mathbb{R}^n \text{ such that } p(X - \bar{x}_h) \le d(h,f)\, p^* y_f^* + d(h,f^1)\, p^{1*} y_{f^1}^{1*}\}$$
is greater than
$$\sup\{U_h(X) : X \in \mathbb{R}^n \text{ such that } q(X - \bar{x}_h) \le d(h,f)\, p^* y_f^* + d(h,f^1)\, p^{1*} y_{f^1}^{1*}\}.$$
For $h$ in $H$ choose $p_h$ in $S^n$ such that $p^* > p_h$, and for $h$ in $H^1$ choose $p_h$ in $S^n$ such that $p^{1*} > p_h$. Now define a price profile by
$$p(h,f) = p^* \text{ if } h \in H, f \in F; \quad p_h \text{ if } h \in H, f \in F^1; \quad p_h \text{ if } h \in H^1, f \in F; \quad p^{1*} \text{ if } h \in H^1, f \in F^1,$$
and
$$q(f) = p^* \text{ if } f \in F; \quad p^{1*} \text{ if } f \in F^1.$$
Then it is easy to see that $(p,q)$, $x_h$, $y_f$, $c$ comprise a competitive search equilibrium, but $q^*(f) \neq q^*(f^1)$.

3. The Housing Search Model.

Lemma 3.1. For $nx > 1$,
$$\ln\left(\frac{nx-1}{nx}\right) + \frac{1}{nx-1} > 0.$$

Proof.
1) Let $g(z) = \ln\left(\frac{z}{z+1}\right) + \frac{1}{z}$.
2) $g'(z) = \frac{1}{z} - \frac{1}{z+1} - \frac{1}{z^2} = \frac{1}{z(z+1)} - \frac{1}{z^2} = \frac{-1}{z^2(z+1)} < 0 \quad (z > 0)$.
3) $\lim_{z \to \infty} g(z) = 0$.
4) Therefore for $z > 0$, $g(z) > 0$ ($g$ decreases down to zero); for $nx > 1$,
$$\ln\left(\frac{nx-1}{nx}\right) + \frac{1}{nx-1} = g(nx-1) > 0.$$

Lemma 3.2. For $nx \ge 2$,
$$1 < nx \ln\left(\frac{nx}{nx-1}\right) < 1.4.$$

Proof.
1) Let $g(z) = z \ln\left(\frac{z}{z-1}\right)$.
2) $g'(z) = \ln\left(\frac{z}{z-1}\right) - \frac{1}{z-1}$, therefore
1) Let g (z) = z in 2) g' (z) n( z)z1 = z z-1 1 - z-1 , therefore 125 3) g"(z) = 1 1 z-l)2 ~ (z) (z-1) is increasing g' (z) 4) = 0, g'(z) im g'(z) 9" (z) > 0 < 0 for 5) g is a decreasing function for z > 2 6) 7) g(z) z -z-l z in lim z-+CO = lim - - lim i ln C--o eC - ln(1 L'Hopital's > 1 z > 1 and for = in 4 < 1.4 nx in nx-1 = g(nx) nx>2 8) By < g(2) z < g(2) 1 1 = < 1.4 lim iln C-4' e lim g(z) rule, lim - ln(1-c) --1 =lim - 1~E 1 c:*o 9) +1 g' (z) < 0 so g is decreasing for z > 2,g(z) > 1 Lemma 3.3 Let Yi,Y 2 .-- Yp be a nondecreasing sequence of nonnegative numbers. Let Ml(-),M2(-),-..Mp(-) be a sequence of positive differentiable functions satisfying p Z p YrMr (x). aLet f(x) = r=1 r=1 Mr (x) If xl > xo and d Mr(x) Mr-1 dx (X) Xo < x < xi < 0 -- r=2,3,...p 126 then f(xi) < f(xo). Yp > Yj and The inequality is strict if d M 2 (X) ddM X0 < x < xi (x) 1 < 0 dx Proof. Mr(xl) < Mr(xo) for r < p-l, 1) If < If Mr+l(xo) - contradicting not, then 2) M1(x Mr+1(Xl) > Mr+1(Xo) Mr(X1) Mr (xo) d Mr+1 x)W Mr (X) < 0 dx 1 then Mr+l(Xl) xo < x < x1 - iot Mr (xl) < qr (xo) for ) > M 1 (xO); If r = 1,2,...p p p Zr=l Mr(xo) = l contradicts Zr=1 Mr(Xl) < p Zr=1 Mr (Xl) 3) If = Mr(Xl) holds trivially. for some r, 1 r = 1,2,...p = Mr(Xo) then lemma 3 Therefore we may assume Mr(Xl) and for some r Mr (xl) < Mr (Xo)- d Mr(Xo) Let j be the least integer such that Mj(xi) < Mj(xo) p 4) 5) r=l[Mr (Xl) - Mr(xo) 1-1 = 0 = j-1 p Lr=1 Mr(xl)-Mr(xo) + Zr=j Mr(Xl)-Mr(xo) = 0 j-l~ z r=1 Mr(xl)-Mr(xo) p ~ r=j Mr(Xo) Mr(xi). (all terms in both summands are positive) 127 j-1 6) Z r=1 Yj-1 [Mr (xl) -Mr (xo) p r=j .Yj < [Mr (xo) -Mr (Xl) j-1 7) Z r=1 since for r < Yr 8) p Zr=j Yr r [Mr(xl)-Mr(xo)1 i > Yj; j [Mr(xo)-Mr(xl)] 1 Yr < Yj-1 and for r > j - to get rearrange p p zr=1 YrMr(Xl) .< r=1 YrMr(Xo)- 9) If d M 2 (X) Mi(x) MI()<< 0 < Xo x < Kl QX > Ml(xo). then Ml(xl) (If Mi(x) Mi(xo) = M 2 (xo) then since M 2 (x) M(Xo) MJ(x) M 2 (xl) < M 2 (xo) and by (1) Mr(xl) <-.Mr(xo) r=2,3,...p. 
This contradicts 1 =r= r=1 Mr(xl) rXi p r=l' Mr (xo)) 10) dM2( W If d Mi(x) dx then X0 < x < xi Mg(xj) < Mg(xo). Therefore examining (7) we see M 2 x) if d MI(X) dx < 0 Xo : x ~~Xl p p zr=1 YrMr (Xl) < Zr=l YrMr (Xo) and Yp > Yi 128 PROPOSITION 3.4. Proposition 4.1.) (Chapter II, mBj (Yi 'jP(j,K) lk=1 = mBj (Yi , j) I k=1 n k ( ) -1 mBj (Yi I mBj(Yij)-k. 1 k-l n( n)V n k-1 n-1 m [Hj(Yi,j)-Bj(Yi , j) I] (-) n K ( mBj (Yi , j) k k ) mBj (Yi k=1 . /n 1 m(Hj(Yij)-Bj(Yi, 11 mBj (Yi, j) 2) Binomial n-1 mBj(Yij)-k n N N theorem states Zko k 4) Z N k=1 N pk (1 -p) N-k N-k 1 [(P+(1-P)] N= 3) pk (1-p) ) -1-N k mBj-(Yij) Pjk 4)k=1 1 mBj (Y, 1 mBj(Yi, SmB (Yi,j) m[Hj(Yi,j)-Bj(Yi,j)] P- j) [m[Hj(Yij)-Bj(Yij)] ) (1 = nn ) mHj(Yi,j)I I 129 Propositions above are proved in the body of the text. Theorem (x,e) (Chapter II, Theorem 4.5.) 3.5. 1) Let f 2) P(x,i) = ZjEAi U(x) xm = then , 1 rnBj (Y , j) [ f (XH (Yi ,j) -Bj (Yi, j)) f(X,Hj (Yi,j) I. 3) If Ai /d .9, Then af < 0 if (x, i) > 0 x, 6) 6E[Hj (Yij)-Bj (Yi,j) and aaxaf e (x,i) < 0 if Hj (Yij)] > 0 (x,6) 6C [Hj (Yi j)-Bj (Yi,) , Hj (Yi rj)] 4) ax a2 f axa e u(x)xme me[ln(n nx [in nll) nx + 1 + 1nxlm 5) a f 6) By lemma 1, sign a f axa e nx 1) + nx-1 [ (x) 40+ iy(x) xm ln u (x)]. y1) xme [1 + exm in yI(x)] 2 7 i= = 7) ignaxae sign [1 + 6xm lny(x)]. gf sign [1 + e~a nxnxln(nx-1 nCnx 130 8) 9) for 0 < 6 < 1 (6 1 Oanx + = Hj(Yi,j)-Bj(Yi,j) e = Hi(Yi > 1 + a nx In(nx-1 nnx-1 1 + anx ln( nx-1 nx or > 0 if a < - nx-1 nx In (nxnx 1 nx In( nx nx-l 10) nx for nx>2, nx In( nx-1 By lemma 2, > 0 2 e aa1) 5/7 if nx>2 aP (X, i) < -nx-1 nx In ( if a< 5/7 < - < 0 ax < 1.4, 1 + eanx in nx-1 nx (13) If (14) If < a sup = 1 -6 anx In 1 + 6 a nx In nx-1 n1 nx-1 < 0, {[Hj(Yi,j)-Bj(Yi,j)] -1 [1 + 6anx In nx-1 nx > 0. - there if < a < 0 (6E:[Hj(Yi,j)-Bj(Yi,j),Hj(Yi,j)] ap (xti) x 1nx-1 JcAi sign and nx In (nx nx nx>2 Ai / (12) therefore and ea )) 131 THEOREM 3.6. 
(Chapter II, Theorem 4.6) 1) Q(x,j) = 1-p()x)mHj (xj) 2) 2 (x,j) = - p(X)xmHj(xj) By lemma 1, 3) If and 4) Let nx>l if i m nx n(nx-1 nx nx>l, Yi + 3Q ax Hj(xj) = E(x,j) f(x) mHj(xj) [ln( nx-1) 1 0; nx-1 + 1 nx-1 1 so < 0. (xj) = ) [mHj(Yi,j)-mBj(Yi,j)] xmHj (Yi, j)I u x) xmHj (xj) 1 - 5) Partition Cj 11x) 1 2 into k disjoint subsets C., C9, -J J k J such that: 6) a) i E C b) i E C t e C J Y. J 1,] t C Cs+ 1 J J . = Y y. . < y iJ Let is be chosen such that i E C 5 t,j J t,j s=1 ,2,...k, then 1lCs li = mBj(Yi 5 j) k 7) E(x,j) = f(x) s=1 III Y. 1j (Yis, j)-mBj (Yi5 x) [mH xmfHj (xj) 1 - 11 (x) j)_ j (X) xmHj(Yi5 ,j) 132 rnHj (xj)-mHj(Yi Sj)+mBj(Yisj) 8) E e= mHj (xj) -m~j (Yi sj) (X) EIli xmHj + 1 (x)-e xmHi-j (x) - Wi (x -e+iJ xmHj (xj) 2.-p X x[mHj(Yisj)-MBj(Yis pj Wx , j) I xmHj(Yisj) -v()W xmHj (xj) ) (3.-PW 9) m~~j-~(~sj+~.Yi~)Mjx)MjYsj (s=1,2, .. .k-1) mHj(xj) mjYlj = mHjCYik,j) 10) mBj (Yik, j) = mHj (xj) E(x,j)-= f(x) r=l x[mHj (xj)-r (x) x[mHj (xj)-r+l1)] -Vi Wx xmhHj (xj) 1 where Yr = - p W) yi srj if m~~j-~(if)lrmj~j-~ 11) f (X) nHj (xj) r=l (i~)mjYsj Yr mr(XW x [mHj (xj) -r]) where Mr(x) J x[mHj(xj)-r+l - li Wx =-W(X) 1 xmHj (xj) (X) 133 x[mHj (xj)-r+1] x[mHj(xj)-rl 12) v (x) Mr (x) Mr-1(x) 1(x) -W (x) x[mHj(xj)-r+2) x [mHj (xj) -r+1] -i (x) [r=2,...mHj(xj)] (x) d Mr (x) Mr-l(x) 13) 1 _ xl (x)2x dx, nx-1 nx, + 1 ] nx-1 d Mr (x) Mr- (x) < 0 for 2<nx and r=2,3.. .mHj(xj) dx 14) mHj (xj) Mr(x) = 1, so apply Ir=1 if x 1 > xo (xo > 2/n) then and then exist i and i' f(xj) 1emma 3.3 to get f(xi < f(xo) with Yi,j > if xo > 2/n Yi, r xj then < f(xo). rX Theorem 2 4. Block Scalar Variance Covariance Matrix. PROPOSITION 4.1. observed data matrix Let X (a1 1P*. 1J) and dependent vector ing Assumptions I-IV: 1) Let E = a2 101 1 0 0 0 2 0 0 2 01,2 2 0 2 Gi Ij FR Y $for satisfy- 134 = T X)-1 XT 2) Let B 3) Let En+1 = E(B n) 4) Let Bn 1 1 1Y. ( T -1 X) ~ XT 1 Y. 
Then
A) For each $n \ge 1$, $B_n$ exists;
B) The sequence $\{B_n\}$ has at least one limit point;
C) If $B^*$ is a limit point of the sequence $\{B_n\}$, then $B^* = (X^T \Sigma(B^*)^{-1} X)^{-1} X^T \Sigma(B^*)^{-1} Y$.

PROOF. Our proof is similar to that done by Oberhofer and Kmenta.

1) Let $f\colon \mathbb{R}^k \times \mathbb{R}_+^J \to \mathbb{R}$ be defined by
$$f(B, Z_1^2, \ldots, Z_J^2) = (2\pi)^{-N/2} \left[ \prod_{j=1}^{J} (Z_j^2)^{m_j} \right]^{-1/2} \exp\left\{ -\frac{1}{2} \sum_{j=1}^{J} \frac{(Y_j - X_j B)^T (Y_j - X_j B)}{Z_j^2} \right\}.$$
$f$ is, of course, the likelihood function should the $\varepsilon_{ij}$ be normally distributed.

2) The concentrated likelihood function is defined by
$$h(B) = (2\pi e)^{-N/2} \prod_{j=1}^{J} \left[ \frac{(Y_j - X_j B)^T (Y_j - X_j B)}{m_j} \right]^{-m_j/2} = \sup_{Z_1^2, \ldots, Z_J^2} f(B, Z_1^2, \ldots, Z_J^2).$$

3) Since
$$\lim_{\|B\| \to \infty} \sup_{Z_1^2, \ldots, Z_J^2} f(B, Z_1^2, \ldots, Z_J^2) = \lim_{\|B\| \to \infty} h(B) = 0,$$
it follows that for any $\delta > 0$ the set
$$\{B \in \mathbb{R}^k \mid \exists\, (Z_1^2, \ldots, Z_J^2) \in \mathbb{R}_+^J \text{ with } f(B, Z_1^2, \ldots, Z_J^2) > \delta\}$$
is either bounded or empty and hence has compact closure.

4) It also follows from 3) that the function $f$ is bounded.

5) It is immediate that for any $B \in \mathbb{R}^k$,
$$f(B, Z_1(B), \ldots, Z_J(B)) \ge f(B, Z_1^2, \ldots, Z_J^2) \quad \text{for all } (Z_1^2, \ldots, Z_J^2) \in \mathbb{R}_+^J,$$
and that $f(B, Z_1(B), \ldots, Z_J(B)) = f(B, Z_1^2, \ldots, Z_J^2)$ implies $Z_j(B) = Z_j^2$ for all $j$.

6) It is also immediate that the unique $B$ that maximizes $f(\cdot, Z_1^2, \ldots, Z_J^2)$, where $Z_1^2, \ldots, Z_J^2$ are fixed positive numbers, is given by $B = (X^T \Sigma^{-1} X)^{-1} (X^T \Sigma^{-1} Y)$, where $\Sigma$ is the block diagonal matrix with diagonal blocks $Z_1^2 I_1, Z_2^2 I_2, \ldots, Z_J^2 I_J$ and $I_j$ is the $m_j \times m_j$ identity matrix.

7) We show that the sequence $\{B_n\}$ exists by induction on $n$. Let $\sigma_1^2, \ldots, \sigma_J^2$ be any positive numbers. The matrix
$$X^T \Sigma^{-1} X = \sum_{j=1}^{J} \frac{m_j}{\sigma_j^2} \cdot \frac{X_j^T X_j}{m_j}.$$
For each $j$, $X_j^T X_j / m_j$ is a positive definite symmetric matrix and $m_j / \sigma_j^2$ is a positive number. Therefore, $X^T \Sigma^{-1} X$ is a positive definite symmetric matrix and, in particular, is nonsingular. Therefore, $B_1$, which equals $(X^T \Sigma^{-1} X)^{-1} X^T \Sigma^{-1} Y$, exists. Assume $B_1, \ldots, B_n$ exist. By assumption $\inf_{B \in \mathbb{R}^k} \|Y_j - X_j B\| > 0$, so that for each $j$, $Z_j(B_n) > 0$. In the above argument which shows $B_1$ exists, replace $\sigma_j^2$ by $Z_j(B_n)$ to see that $B_{n+1}$ exists. [X] A.

8) It follows from 5) and 6) that
$$f(B_n, Z_1(B_n), \ldots, Z_J(B_n)) \le f(B_{n+1}, Z_1(B_n), \ldots, Z_J(B_n)) \le f(B_{n+1}, Z_1(B_{n+1}), \ldots, Z_J(B_{n+1})),$$
hence from 3) $\{B_n\}$ is bounded and therefore has a convergent subsequence. [X] B.

9) Let $B^*$ be one limit point of $\{B_n\}$ and let $B_{n_k} \to B^*$. Then
$$f(B_{n_k}, Z_1(B_{n_k}), \ldots, Z_J(B_{n_k})) \le f(B_{n_k+1}, Z_1(B_{n_k}), \ldots, Z_J(B_{n_k})) \le f(B_{n_k+1}, Z_1(B_{n_k+1}), \ldots, Z_J(B_{n_k+1})) \le f(B_{n_{k+1}}, Z_1(B_{n_{k+1}}), \ldots, Z_J(B_{n_{k+1}})).$$
Since:
i) $f$ is a bounded function;
ii) $B_{n_k+1} = (X^T \Sigma(B_{n_k})^{-1} X)^{-1} X^T \Sigma(B_{n_k})^{-1} Y$, so $B_{n_k+1}$ converges to $(X^T \Sigma(B^*)^{-1} X)^{-1} X^T \Sigma(B^*)^{-1} Y$, which we denote by $\tilde{B}^*$.
Letting $k$ increase to $\infty$ and using the fact that the functions $f$ and $Z_j$ are continuous, we have
$$f(B^*, Z_1(B^*), \ldots, Z_J(B^*)) \le f(\tilde{B}^*, Z_1(B^*), \ldots, Z_J(B^*)) \le f(B^*, Z_1(B^*), \ldots, Z_J(B^*)).$$
$\tilde{B}^*$ is the unique element of $\mathbb{R}^k$ that maximizes $f(\cdot, Z_1(B^*), \ldots, Z_J(B^*))$; therefore $\tilde{B}^* = B^*$. [X] C.

PROPOSITION 4.2. Let $g\colon \mathbb{R}^k \to \mathbb{R}^k$ be defined by
$$g(B) = (X^T \Sigma(B)^{-1} X)^{-1} X^T \Sigma(B)^{-1} Y = \left[ \tfrac{1}{N} X^T \Sigma(B)^{-1} X \right]^{-1} \left[ \tfrac{1}{N} X^T \Sigma(B)^{-1} Y \right].$$
Let $F$ be the set of fixed points of $g$, and let $d = \sup\{\|B - B_0\| : B \in F\}$. Let $\gamma_0 > 0$; then under Assumptions I-V, $d < \gamma_0$ almost surely as $N \to \infty$.

Proof. It should be understood that $X$, $Y$, and $N$ all appear explicitly in the definition of $g$, and that, therefore, for each $N$, $d$ is a random variable. It should be remembered that $N = \sum_{j=1}^{J} m_j$ and that, in effect, $m_j = m_j(N)$. Recall that $X$ is an $N \times K$ real matrix and $Y$ is an $N \times 1$ vector. For a fixed pair $(X,Y)$ let $\hat{B}^N$ be a fixed point of $g$. Where no confusion is likely to arise, we drop the superscript. Thus:

1) $g(\hat{B}) = \left( \tfrac{1}{N} X^T \Sigma(\hat{B})^{-1} X \right)^{-1} \left( \tfrac{1}{N} X^T \Sigma(\hat{B})^{-1} Y \right) = \hat{B}.$

Recall that:

2) $Y = X B_0 + \varepsilon$

and rearrange 1) to get:

3) $\tfrac{1}{N} \left[ (X^T \Sigma(\hat{B})^{-1} X)(B_0 - \hat{B}) + X^T \Sigma(\hat{B})^{-1} \varepsilon \right] = 0.$

Now premultiply both sides of 3) by $(B_0 - \hat{B})^T$ and expand to get:
.X (B A and let By convention, we take however, may be empty. where o L S (B + (B -B) M. f (B -9) m AT X.e. -3 1 -B) -$) + A nr (B--) (B) J (B O-$) ] I0 - 139 X .E. N (^ Z6.B) Li4 XZji( j CS and e. , and for N then are independent of Z ($ ) N . Hence, for N inf BF]RK m (N) Z.(B) 3 4a , X is sufficiently large, and so is bounded away from zero. For lim N+)00 N ax 2 -bx >- sufficiently large, 3 also independent of b>O A 4 N For (From elementary calculus if a>0 12 1 m. (N) N Let j C S2 , m (N) is bounded, therefore, for j ES2 =0. y > 0 it follows immediately from the above that: , that: ,r (B m. (N) es2 8 j F-S + - (B -B") 0 -a-M. >- N almost surely as N + = By 5) and 7) we now have that 8) o mN) 3N) 3N3 ( T for arbitrary y > 0 (B-B 0 ) + (B 0 -B (N) J z . (B N Y 140 almost surely as N + w . Before proceeding with our proof of consistency, we needrteoestablish 9) Lemma 4.3. on t'vo elementary numericail lemmas. Let a, b, c, be positive real numbers, then {x I x > 0}, {i the function f(x) functiobx+c ax = is an increasing function. Proof. 10) Lemma 4.4. ac (b x4-c) > 0 Let a,b,c,d, be positive real numbers. f(x) = Let = f'(x) ax-bx cx2+2bx+d x > Then for - a-- f (x) 2ab 2 4 (c+a)b 2 +a 2 d a b> Proof .a Now use lemma - 2b a 2 2 cx 2 +2bx+d 2 2_ -bx - __ (., +a) x 2 +d 4.3 to get: 2 cx ax bX +2bx+d 2 2ab 2 . 4b 2 c+a)a2 + d 4(c+a)b 2 +a 2 d > 141 We now analyse the 2 LHS JIB 0 -B"1t> For of equation 5) m + J (B 0 -B) J ) (B 0 -B mn -T Xme + (B -B) mn. ll) ($") Z . XI B -B 2K 0. IB O-BI - I I E. X.T C. 2 + 21B o -B I~- T| B -B Z. (B (Recall that = (B + 2(B.-B 0 For + (BO-B) 3 use 10) + mn to get: 3 IIB 0 -B II A, 4 T >2 J. X. (B 0 -B) 12) -B) -7- XJ m m - m. M. -r ( (BO-B + (B 0 -B ) x 2X 2 M. J x7 E. 4(T+>) 2 3 + X2 T S3 mnj (B ) Z x. . I C. 142 if < 3 and if , < V then using 10), 12) J becomes: |1 JIB -B For > 6 13) (N T (B0 -B) X X. 3 3 (B 0 -B ) + (B 0 -BN T XI 3 3 "N Z . (B) J N 8 (T+-X) 6 2 + 2V m. (N) T For all ( 2 1, therefore for N sufficiently j=l m. 
(N) large N jE S it follows V > a.2 y=1/2 k62 > 1/2 . Let a.2 , then above, let V=1+ j=1 all j . In equation 8) 8 (T+X) 6 2 +2V By the Strong Law of Large Numbers we have T (14) almost surely max je S( N + W and 15) almost surely <2 max j CS N + = |M.(N J < V 143 8) gives us that T A( zjS (B 0 B (Bo-B 5 m. (N) ND N 1 )+ T A ) (B0-B (N Z. (t") 1/2 X6 2 8 (,T+X) 62 + 2V almost surely N + . Now observe that if : M (N) Ij5 X. > 1/2, $N max E. M N j (N) j.S 2 max (N , ES xTr 7-T M. (N) and Zj ES (B -B N ) Xix. I< v (BO-B) + (B 0-B )- AJ) Z. (B 1/2 X6 2 8(T+X)6 2 .+ 2V Then |1B A 0 -B II < We can now conclude that for almost surely as N + = . 6 > 0 IBo- B I < 6 144 COROLLARY 4.5. Let 0 property that for each 2) number 1 ,.. such that for each pair 2 {B n a 1 ,... point of Assumptions I-V, 0 the property for each (X,Y), 2 a1 , 1 ,..a E(X,Y) (X,Y), 2 ,a1 , I-V, the sequence Let N, 2 0 be an estiator with there exists positive such that for each pair Then under Assumptions XY)}. {Bn (1,1...,a convergent almost surely as Proof. is a limit Then under 2,X,Y)}. 1 2 ... O(X,Y) N) is a limit point of 2 {Bn (a perhaps depending upon is strongly consistent. PROPOSITION 4.6., numbers there exists positive N, 2 ., be an estimator with the XY)} 2 is N t o. Consider the mapping g: IRk +Rk defined by: 1) It g(B) = (XT (B)~ X) 1 XT Z Y(B) Y follows immediately from the definition of the proof of Proposition 4.2. that that 0(X,Y) is a fixed point of g Bn+1 = g(B n) g. and and As in the proof of Proposition 4.2, let B = 0(X,Y). exists a subsequence B. converging to Since there B, -the 145 {B ( n1 g is original sequence converge to if B 1 . .a 2 g(B) H1 2) < 1/2 11B - B only show that we need {B n implies that 1/2 < - g(B)l will a contractionmapping near Therefore to show convergence of B. XY)} Y~ ,J 2 1 B - BI) [Here as elsewhere in this proof, all norms refer to To see that this suffices, observe the operate norm. that B. 1 Bn g (B) - -+ B j< B imDlies that there exists 1/2. 
(B), = g B = g(B) Since an d since G such that = g(g(B)): = +r Pn p g (B P ), we have that: || Bn +r P 3) | Bn (1)2 gr(B)H g ( n ) = ~2) ( 1 r+l p 4) g(B) - g(B) XTE(B) = N XTE (B) X) ( XN TE ()X )-lX) Y) -1. 1 XT Z (B) - Y) Use the following facts: Y 5) 6) 11AB CD 11 < | Al l = XB 0 |[ B - D +A [ - Cl ||1 D 1| 146 1 11A 7) - B_1 1 < 11A - to get from 1 A - BI1 1 B_ 1 1 4): 8) 11g (B) -g(B) ell x (B) - XT 1 X) B XT 1( -1 -1 XxT E (B) 1 - 11 XTE (B) II - g(B) || B - B I 1/2 < 1 1X We wish to show that for |1 B - BII Jg(B) XT Z(B) El) X - X Z (B 1 T + 1 X E(B) T N X-Z(B) B, about -1 XI < 1/2, almost surely as since we have shown I B - BO l1 < 1/2 N + o. almost surely as Let S1 = {jM (N) is unbounded as Let S2 = {j M (N) T is bounded as N + o}. N ± T (Be-B)T - 1 -- (BA -B) m - U -u J Z. (B )= therefore for B0 - B|1 9) 2 T < _ 1, < Z.(B) -- 1 m - m=i min J 0 (a 2 )2 m < M < +3 Jn we have T T m. T _ J T F_j co}. + 2(B 0 -B)T xm. - then 1X)'hI1 We can confine our analysis to the unit ball N + o. Let -I < T + 2 and let and for 3 M3 m + . 3S m. M = T + 2 +- jC S 1and llB-B0 1 147 we have: almost surely as < M M < Z. (B) 10) - J For B such that ( X (B) 11 B - B0 Since X)l. , < N' t 0 We wish to find N XT (B)X is positive definite symmetric, to find a upper bound on the norm of its inverse, we need only take an inverse of a The positive lower bound of its minimum eigenvalue. XT (B) minimum eigenvalue of 11) For ZT(1 x T(B) inf eIRk H Zfl Z_ E R k, N X)Z Therefore by 10) N + 0. ( $(XT X) particular, since N + co, we have: = and for Hence, for X E(B) lx)z Z .XTE (B) - X)Z eigenvalue of is given by =1 M. zT( 12) X) (B) | X) zT x-T X Z X. X. Z 3 m N=1 Z (B) B - B0 < < * M. 3 CS 1 Z .3 the minimum almost surely as JIB - B 0jjz1 almost surely as JIB - B 0 l < 1/2 N t n. almost surely as In 148 1 () N (XT (XzB 13) 2M X) - N t m. almost surely as T 1 T Ngx- E(B) 14) 1 T RX, E(B) - Z (B)) + eS2 Now m. Z (B) = 3 (B0 -B) + 2(B0-B)T M. 
J3 (B 0 -B) -T x.Tx Jm.3 (B0-B) T (B 0 -B) - 3 Z ( ) XTX Z (B) j2 Z..B M. Z. (B) Z.i(B) X. jcS N T m m. 21 N Z. (B) Z.(B) 21 21 2 2(B0-B)T B X.T M3 X. TE M. 15) < T11 B0 -B I Z. (B) 31 Z(B) |1 B-Bli + T11 B0 -BI1 B-BI| T + 21 < T B -B + |1B 0 T B0 -BII T + +2 x ,. E 2 J Let 0, B - B - _ j c S1 then for and I B - B1 < 1, it follows that: X T 16) -. B) J1 Z. (B) j e S2, X , m (N), independent of (Zj(B) - Z (B)) 1 < 11B-B N t o. almost surely as For m. N. 21 are all eventually By assumption Z (B) > 0 all H 149 B £ j and for IRk e S2, m. (N) 0, lN N = 2N to N therefore fJ B - Bol . : we have for T 17) Ej .1j 11 S2 Z.(B) Z.(B) J1 J YI 11B - B1 Combining 15) - ^ j almost surely as - Z. (B)) I1 21 N + 00. and 16), we get: E(B) I 18) (Z. (B) 21 m 3 - p TE B) Sup 0 < B 6 IR k |B-Bo 1 11 B - B0 11 N t o. almost surely as m. M.XT X.TeE 1XT(B) 1 jES + m. N jES 2 m. X.T 3 3 3 N m. Z. (B) and 1 Z . (BI) therefore for lIB - B 0 l T ,(-B 1 X' (B) £1 19) N < XTZ (B) and -<1 y2 > 0 Hence, almost surely as 2 Y2 E almost surely as N t o. ST 20) X-TE(B) -x XT Z(B) Xj ES N M. 21 Z (B) Z j (B)- T m. X. X (Z.(B) 21 - Z.(B)) + 9L jeS 2 Zj(B) Z. (B) (Z. (B)-Z. (B)) 1 21 150 By 15 and 20) and for 21) I |1 B - B0 1 - 1 SIT -lX XTE(B) 1 X SI XT (IB) XI I N almost surely as < m y 22) 2 [2T+3] B-BI| o. N Combining 8), 13) , 18) , 19) , 21) choices of we get Y2 and and by the proper we get: g(B) Sup k | B 0C1| B-Bj < 1/2 g (B)I - < 1/2 -B N t c almost surely as [X] The next result is a minor improving of the last proposition. It will however pave the way for showing that almost surely as independent of the by g(B) = (1 XTZ(B) Let K(N): 2 (1 PROPOSITION 4.7.. gets large, N 2 Let X) 1 X1 T ( X- Z(B) Sup B || E R B 2 c ]Rk || B - B0 |IIB 2 -B0 ~z1 K 0 < |1 B 1 -B 2 1 selected. -*1R1, g:IRk g (B = our estimator is ) B be defined 1 Y). g(B 2) - B2 1 151 Then K(N) < 1/2 Proof. almost surely as N t m. As in the previous proposition all norms refer to the operator norm. 
We use the same notation as in the proof of the last proposition. 1) - g(B 1 )-g(B 2 2 )yX - X)1 T (B 2 ) 111XTE (B 2 -1 X ) X XT(B 1) 1 1 1f T(E(B2 -XTX(B ) 1 c 1 XTE(B ) l6 e x(B X .1 _ X) . From the proof of the last proposition, we have already shown that: 2) Sup almost surely as (XT Z(B) 1 B - B f0 1< 2M N + o. Here as elsewhere in this proof, let elements of the unit ball about B1 , B2 be B 0. X.X. Z (B1 ) - Z (B 2 ) = + 2(B 0 - (B0 - B T m.3 (B0 - B1 ) + X.T C M.T B1) T JJ - (B 0 - B 2 )T T ___ inM.J(B 0 - B 2) 23 - 2(B 0 - B2 ) T T . 3 M. J~ 23 152 and so: 3) IZ (B) < 2 1B 2 - B1 | Z (B2) - T x.E. T x. e. + 2 | 4) 2 - B B| > 0, y Let Z j N 5) S 2 11N X., N and NX E(B 2 ) =j J j . 2) Z (B ) Zj -1 B - E m, 3( (B() j(B -- N t o. 1 -1 T B ) 1 (B2) Z(B ) and therefore: B-B are all eventually B2 1 E 1 1 that: almost surely as 1T c. X. 2 (B2 it immediately follows from the m. 5) - B N + t. m (N), independent of 2 it follows that (B1 )-Z e S2 , facts that ]B c Sl, 2) almost surely as For _ <B [3 j then for ) T + X E.: m 3 Z (.Z i(B1 ) ( (B Z 2) B 153 6) III Sup SIB 1 -B0 1I<_ XTI(B E - l < JIB 12 -B_oi< HIB B1 1 x T( B 2 Y B 2 11 1 d B2 almost surely as N + c. Let Y2 > 0, we showed in the proof of the last prop that: 7) III XTr(B)~ 'i Sup IIB-B almost surely as N + w. -2 <1-T o0 m XT (B2) j(B I< ~IX - ))- Z (B 2) XTE(B ) + J Ss 2 X. I 2 X.X.iT N M. 3 m. X. X. N 3 3 1 1 WW ( 2)zW(B 1 Z (B (Z. (B1 ) and so by 3) we get: 8) Sup IIB XTE (B) - I jB 2 B0 1 < 1 X - i B2 x (B1 1- 2T + 3 B1 B 0 11 <.1 B B2 Now by the appropriate choices of y and Y2 we get our desired result that: 9) Sup IB1-B o IB 2 -B_Oi B # B2 IJg(B 1 ) < 1 < 1 JIB - g(B 2 ) 1 - B 2 11 l almost surely as NT - 154 Let = (XT g(B) implies 11 B2 (B)~ - be defined as before by X) 1 (XTE(B)~1Y). || B - B0 | - B0 l B -R k g:IR < 1 < 1 implies that Since, if points we have B both | B1 - B2 11 - || g(Bl) g < 1 g(B 2) and - 1/2 has a unique and g(Bl) 1 = = B g(B) B1 - B0!| and that then it follows that B2 |! 
fixed point. B2 are fixed - g(B 2) | < 1/2 and thus || B 1 - B 11 = 0. 2 11 B1 - B2 11 PROPOSITION 4.8. Let as in Proposition 4.7. Let Suppose Let J 1 (2Tre)- N/2 h(B) = k IR g:iIR F Y. j=1e - Rk |g(B) = B). T - m. -1/2 X.B) T(Y. - X.B) 3 3 3 mM. = {B be defined £ Then: 1) F 2) There exists a unique is a singleton almost surely as F = {B*} where that maximizes Proof. 1) that maximizes h N + o. almost surely as 3) B N + o. B* h is the unique element of almost surely as N + o. Proposition 4.2. gives us that: supj| B - B0!! < 1 almost surely as Bsf Proposition 4.7. yields that: N + Rk 155 Sup BI-B0 H< I 2) B 2 -B 0 g(Bl) B1 - g - B2 1/2 2 1 almost surely as N + o. Proposition 4.1. shows that F cannot be empty, thus we have: 3) F is a singleton almost surely as N + *. Proposition 4.1. gives us that there exists B E IRk B is that maximize h a fixed point of g. and furthermore any such The rest follows immedi- ately, COROLLARY 4.9. Let the property that for each a1 2 (N),.. . ,a G(X,Y) h(B) N there exist 2 (N) positive numbers such that is a limit point of 2 (N),XY))} {B n(a1,12(N),..., 1) be an estimator with O(X,Y) O(X,Y) then: maximizes the function (2re)-N/ n=1 almost surely as [ m (Y. -X.IB) N t c (Y- -X . B) m -1/2 156 almost surely as 2) E(X,Y) 2 0, (N) almost surely as limit points of g. 2 , the are fixed {Bn( 2 2 {n 01,1 '''' 0 1,J ,X,Y)} Part 1 of the corollary now follows. is a fixed point B1 N +-- 2 a1 ,1 ,-.aig For any choice of Proof. If 2 + is independent of the choices for 2 ca1,1 (N) ,...,j points of N of g, and if Z(B ),then we have if 2 2 '' 1,J ,X,Y) = B 1 for all n. Therefore ,1 '' G = {B*}, where F is the set of fixed points of g, a {B then it follows that for any choice of 2 ,1,. .. 1,1 2 0 , , J2 the sequence 1 1 Therefore whenever F B*. must converge to ,X,Y) is a singleton, pendent-of the choice o T 2 , ,J 0 2 is inde- 157 LEMMA 4.10. 
jm,.(N) Let aN =X j= N j=l NN Then: Cj 2 aN: = a exists and max aj 2 > aN>min -a A) lim N+ -o B) trace a N -1ZN = N all N C) 1 T lim K X. (aN N + co 1) Part A follows immediately from assumption 6 and from the 2) 2 exists and is positive definite. N)~X fact that aN is a convex combination of the Trace a N N N N aN. trace ZN N.= aN 1 --N J aN 1 3) +). co -lX XT lim N N-+o N 4) lim N + aN 1T N 2 2 = N N N IX N aN exists and by Assumption VII 1 X (a IZ )~'X exists. N N (X (aN -~ 1 X_ X)N aN Thus I (N) N J j=1 , m (N) a~~T max a.2 [I[ (X (aN 1 J2 min a. 2 N)X) j and hence: N aa a. exists; thus 1 (XT . j=1 m (N)a 31 N j=l T N 5) mj(N) L.a2=--2 = a (X N(a(rN)'X NN By part A) lim alN 1 XT (aN lim N N) max a. j min a. j j X is invertible~with the norm 2 +- of the inverse boun ded by 1 j *1 158 LEMMA 4.11. T E {} 2 < KTa. Proof T 2 /Ek 1 i=1 X 2Am1 jik ji k=1 S0 Since E (x(Xik .-.-. i1Xask i / as) -1. V- s=1 7- 2 x E jsk Eis . 2 i = ik we have: 2 X.T k 21 m m. j 2 m i. 1 Xjik k=1j = 1)Ej (where x. jik is the ik element of the matrix .X.). X TX < T, and since is positive semi m ~m. definite symmetric, each element on the diagonal must have values X.T X. J By assumption less than or equal to 2) I m j=l 1 m. T. i2 uk < - Therefore: T 2 a 1. and since xjik are non Since E(z j2 stochastic, we have from 1): T E 2 Xj =kE k=1 a j 2 i -L i=1 m x.2 < K a 2 T jik - The next lemma is an immediate consequence of Chebychev's inequality. j 159 LEMMA 4,12. EYn2 < M < m, Xn + 0 in probability and suppose XNYn + 0 is probability. Let then 6 > 0, C > 0, For any Proof. we have: IYn and Prob{IXn n < 6 ) > Prob {IXn! < 1) c}. I By Chebychev's inequality we have: EBY 2 Prob { 1 n 2) y > 0 Let ently large that can find Hence for' n > No, -2 C} > Prob {Yn > such that N 2 > C2C by 2) we can choose 6 > 0; and let I-- < C} > Prob {IXn < < y/2 + y/2 = y and so Prob{IXn < By hypothesis we c+} > Prob {IXnI we have suffici- C - and - for or n < n>No 'YnI C} > > C} -Y [X] Lemma 4. PROPOSITION 4.13. 
g(o(X,Y)) property that g(B) = = e(X,Y) Y), -1-1Tr(B) ) (XT( be an estimator with the e(X,Y) Let where g(B) = (XT(B) X) then under the hypothesis a above: A: e is asymptotically equivalent to the weighted least squares estimator with known variances in the sense that if W p lim /N(O-W) = 0. N+ o is asymptotically normally distributed with mean vec- is the latter, B: tor e B0 (XT -1 C: and variance covariance matrix -1 (1XTE - 1 11 -1 _ X lT . 1 -1 -+ 0 -1l almost surely as N + o. 160 Proof. = (XT (aN X is a Let VN N) -1N N x K -1 aN = -1 T W then N, N (X = 1 X) (XT V Y) N X) T wYre -lXT yN- 1Y where matrix. m.(N) Let e(XY) cN (X = zj(e) and let VN N = j=1 VN X A and B -+ V)X N XT ( VN_ plim ii) Principles of Econometrics, it suffices to show N1 plim NT XT vN 1) 0 and = VN- 1 ) e: = 0 O4N N)X 1 1 T ^ NX (VN T - N X N_ )- NZ NZ)X - Therefore: N 1) then * T" N- By theorem 8.4 of Thiel's to prove (6), =N XTN-1 VN)X m (N) X =l = N T X. a a 6( ) Z. m. ai3 m.(N) = Z Recall that m. (N) Z (0) N N (e) j and that for any J, * m. (N) aj2 + 0 - N J almost surely N + oo Hence: 2) 0 almost surely as N + co aN T ^ 1 and we have N X (VN V N~)X = 0 (1 (I) ) XT( N 1 - N~)e = N IT almost surely as T X ( - TaNZ )& and thus we have: 3) (X (VN 1 V-N Im.N E. . XT iN ( N Z (6) aN 161 x TE By lemma 2 E I 2 < KTa. 2 independent of N. '---|j /m.(N) j For e S , 1_ aN aN 31 j 2 0 almost surely and since + a_()__ m.N a < 1, 0 < NN it follows that a 2 t. almost surely for N + If j M < w. such that C S 2 , then since 9 aN aN Z() 3 e m.(N) N N and since M. (N) aN Z(-z ) N. N. 0 as N + c, + aN1 - almost surely as N + .<M j.I ) 0 almost . we have surely as N + m. We are now in position to use lemma 3. to conclude that, T j =1 m m(N - Z.6 N (N) converger to zero in probability and thus: 3) plim 1 xT(VN N + c VN TExT - XT E ~ X - - VN Nm.(N) Xx N N XT ( e - x = so part C follows immediately. )s = 0. NN X.TX. N Z 6)1 3~ 162 5. Linear. Variance Model (Strong Law of large numbers) PROPOSITION 5.1. 
Let $\varphi$ be a positive, even, and continuous function on $\mathbb{R}$ such that, as $|x|$ increases, $\varphi(x)/|x| \uparrow$ and $\varphi(x)/x^2 \downarrow$. Let $\{X_n\}$ be a sequence of independent random variables with $E(X_n) = 0$ for each $n$, and let $0 < a_n \uparrow \infty$. If

$$\sum_n \frac{E(\varphi(X_n))}{\varphi(a_n)} < \infty,$$

then $\sum_n X_n/a_n$ converges almost everywhere and $a_n^{-1}\sum_{j=1}^{n} X_j \to 0$ a.e.

Proof. Chung [7].

PROPOSITION 5.2. Let $\{X_n\}_{n \ge 1}$ be a sequence of independent random variables and suppose that for some $1 < p < 2$ there exists $m < \infty$ with $E|X_n|^p < m$ for all $n$. Then

$$\limsup_{n \to \infty} \left| n^{-1}\sum_{j=1}^{n} X_j \right| \le m + 2 \quad \text{almost everywhere.}$$

Proof.
1) Let $Y_n = X_n - E(X_n)$. Then $|E(X_n)| \le E|X_n| \le 1 + E|X_n|^p \le 1 + m$.
2) $E|Y_n|^p = \int |X_n - E(X_n)|^p$. By Minkowski's inequality,
$$(E|Y_n|^p)^{1/p} \le \left(\int |X_n|^p\right)^{1/p} + \left(\int |E(X_n)|^p\right)^{1/p} \le m^{1/p} + 1 + m \le 2m + 1,$$
taking $m \ge 1$ without loss of generality.
3) In Proposition 5.1 let $\varphi(x) = |x|^p$ and $a_n = n$. Since
$$\sum_n \frac{E|Y_n|^p}{n^p} \le \sum_n \frac{(2m+1)^p}{n^p} < \infty,$$
we conclude that $n^{-1}\sum_{j=1}^{n} Y_j \to 0$ a.e.
4) $\limsup_{n\to\infty} \left| n^{-1}\sum_{j=1}^{n} Y_j \right| < 1$ a.e.
5) $\limsup_{n\to\infty} \left| n^{-1}\sum_{j=1}^{n} E(X_j) \right| \le 1 + m$.
6) $\limsup_{n\to\infty} \left| n^{-1}\sum_{j=1}^{n} X_j \right| \le 1 + m + 1 = m + 2$ a.e.

PROPOSITION 5.3. Let $\{X_n\}_{n \ge 1}$ be a sequence of random variables taking values in the same fixed space $E$ (where $E = \mathbb{R}^j$ or $\mathbb{R}^{j \times k}$). Let $\{X_n\}_{n \ge 1}$ have a limiting asymptotic distribution $D$. Let $\{Y_n\}_{n \ge 1}$ be another sequence of random variables taking values in $E$. If $\operatorname{plim} \|X_n - Y_n\| = 0$, then $\{Y_n\}_{n \ge 1}$ has the limiting asymptotic distribution $D$. Theil [z19].

Note: Convergence almost everywhere implies convergence in probability.

PROPOSITION 5.4. Let $H$ denote an arbitrary Hilbert space and let $P(H) \subset L(H)$ be the set of bounded positive linear operators. Then the positive square root function ($A \mapsto A^{1/2}$) maps $P(H)$ onto $P(H)$ and is uniformly continuous. [$A \in L(H)$ is positive if $A = A^*$ and $(Ax, x) \ge 0$ for all $x$ in $H$.] ($A = A^*$ is implied by $(Ax, x)$ being real valued for all $x$ in $H$.)

Proof. (Sketch)
1) The positive square root function is a monotone function when restricted to $P(H)$.
2) The square root function defined on $(0, \infty)$ is uniformly continuous.
3) Therefore, if $\{A_n\}$ and $\{B_n\}$ are sequences of positive operators such that $\|A_n - B_n\| \to 0$, and if $\epsilon > 0$, then there is an $N_0 < \infty$ such that for $n > N_0$, $A_n \le B_n + \epsilon I$ and $B_n \le A_n + \epsilon I$.
4) $\sqrt{A_n} \le \sqrt{B_n + \epsilon I} = \sqrt{B_n} + B_n(\epsilon)$ and $\sqrt{B_n} \le \sqrt{A_n + \epsilon I} = \sqrt{A_n} + A_n(\epsilon)$, where $\lim_{\epsilon \to 0} \|B_n(\epsilon)\| = 0$ independent of $B_n$.

We are now in a position to list our assumptions and to prove our results. For the convenience of the reader, we have followed much of the notation of White [51]. We have also borrowed liberally from him in the wording of our assumptions.

A1) The model is known to be
(6) $Y_i = X_i\beta_0 + \varepsilon_i$, $i = 1, 2, \ldots, n$
(7) $E(\varepsilon_i) = 0$
(8) $E(\varepsilon_i^2) = \sigma_i^2 = Z_i\Gamma_0$, $i = 1, 2, \ldots, n$

where $X_i$ is a $1 \times k$ vector of random variables, $\varepsilon_i$ and $Y_i$ are real valued random variables, and $\beta_0$ is a $k \times 1$ vector of real numbers. $Y_i$ and $X_i$ are observable, $\varepsilon_i$ is unobservable, and $\beta_0$ is to be estimated or hypotheses concerning $\beta_0$ are to be tested. $Z_i$ is a $1 \times m$ vector of real valued random variables which may contain some or all of the variables in the vector $X_i$. $\Gamma_0$ is an $m \times 1$ unknown vector of real numbers which is to be estimated or hypotheses concerning $\Gamma_0$ are to be tested. Let $W_i$ be the vector of length $p$ of random variables whose first entry $W_{i1}$ is the scalar 1 and whose other entries are exactly those random variables that appear in $X_i$ or $Z_i$. We assume that $E(W_{ij}W_{ik}\varepsilon_i) = 0$ for $1 \le j, k \le p$ and $E(W_i^T(\varepsilon_i^2 - \sigma_i^2)) = 0$. We let $\eta_i$ denote $\varepsilon_i^2 - \sigma_i^2$. The vectors $(W_i, \varepsilon_i)$ are assumed to be a sequence of independent though not necessarily identically distributed random vectors.

A2) I) There exist $0 < \delta < \infty$ and $\Delta < \infty$ such that for all $i$:
a) $E(|\varepsilon_i^2 W_{ir}W_{is}W_{it}W_{iv}|^{1+\delta}) < \Delta$, $1 \le r, s, t, v \le p$
b) $E(|\varepsilon_i W_{ik}W_{ir}W_{is}W_{it}W_{iv}|^{1+\delta}) < \Delta$, $1 \le k, r, s, t, v \le p$
c) $E(|W_{ij}W_{ik}W_{ir}W_{is}W_{it}W_{iv}|^{1+\delta}) < \Delta$, $1 \le j, k, r, s, t, v \le p$
d)-f) three further moment bounds of the same form, $E(|\cdot|^{1+\delta}) < \Delta$, with index ranges $1 \le r, s, t \le p$ and $1 \le r, s \le p$ [integrands illegible in the source].

II) Let $\bar{M}_n^a = n^{-1}\sum_{i=1}^{n} E(X_i^T X_i)$ and let $\bar{M}_n^b = n^{-1}\sum_{i=1}^{n} E(Z_i^T Z_i)$. We assume that there exist $N_0 < \infty$ and $0 < \lambda$ such that for $n > N_0$ the minimum eigenvalue of $\bar{M}_n^a$ exceeds $\lambda$ and the minimum eigenvalue of $\bar{M}_n^b$ exceeds $\lambda > 0$. (Note: by the first part of A2, this is equivalent to the property that for $n$ sufficiently large $\det \bar{M}_n^a$ and $\det \bar{M}_n^b$ are bounded away from zero. Also, observe that we can choose $\delta$ and $\lambda$ so that they are equal.)

A3) Let $\bar{V}_n^a = n^{-1}\sum_{i=1}^{n} E(\varepsilon_i^2 X_i^T X_i)$ and let $\bar{V}_n^b = n^{-1}\sum_{i=1}^{n} E(\eta_i^2 Z_i^T Z_i)$. We assume that there exist $N_0 < \infty$ and $\lambda > 0$ such that for $n > N_0$ the minimum eigenvalue of $\bar{V}_n^a$ exceeds $\lambda > 0$ and the minimum eigenvalue of $\bar{V}_n^b$ exceeds $\lambda > 0$. (There is no loss in generality in assuming that $N_0, \lambda$ in A2 and $N_0, \lambda$ in A3 are the same. In the presence of A2, A3 is the equivalent of the assumption that for $n$ sufficiently large $\min(\det \bar{V}_n^a, \det \bar{V}_n^b)$ is bounded away from and above zero.)

The first theorem is a restatement of a result found in White [ ] and its proof can be found therein. Before stating Theorem 1 we introduce additional notation.

Let $\hat\beta_n = (X^TX)^{-1}X^TY$ if $X^TX$ is nonsingular, and $\hat\beta_n = 0$ if $X^TX$ is singular.
Let $\hat\alpha_n = (Z^TZ)^{-1}Z^T\varepsilon^2$ if $Z^TZ$ is nonsingular, and $\hat\alpha_n = 0$ if $Z^TZ$ is singular. [$\varepsilon^2$ is the $n \times 1$ column vector whose $i$th entry is $(\varepsilon_i)^2$.]
Let $\hat\varepsilon_{in} = Y_i - X_i\hat\beta_n$.
Let $\hat\eta_{in} = \varepsilon_i^2 - Z_i\hat\alpha_n$.
Let $\hat{V}_n^a = n^{-1}\sum_{i=1}^{n} \hat\varepsilon_{in}^2 X_i^T X_i$.
Let $\hat{V}_n^b = n^{-1}\sum_{i=1}^{n} \hat\eta_{in}^2 Z_i^T Z_i$.
Let $R^a$ be a $q \times k$ matrix of real numbers with full row rank and let $r^a$ be a $q \times 1$ vector of real numbers. Let $R^b$ be a $q \times m$ matrix of real numbers of full row rank and let $r^b$ be a $q \times 1$ vector of real numbers.

THEOREM 5.5 (White). Under assumptions A1, A2, and A3, we have the following:

i) $\hat\beta_n \to \beta_0$ almost everywhere.

ii) $\hat\alpha_n \to \Gamma_0$ a.e.

iii) $\left[\left(\tfrac{X^TX}{n}\right)^{-1} \hat{V}_n^a \left(\tfrac{X^TX}{n}\right)^{-1}\right]^{-1/2} \sqrt{n}\,(\hat\beta_n - \beta_0) \overset{A}{\sim} N(0, I_k)$

iv) $\left[\left(\tfrac{Z^TZ}{n}\right)^{-1} \hat{V}_n^b \left(\tfrac{Z^TZ}{n}\right)^{-1}\right]^{-1/2} \sqrt{n}\,(\hat\alpha_n - \Gamma_0) \overset{A}{\sim} N(0, I_m)$

v) under the hypothesis $H_0: R^a\beta_0 = r^a$,
$$n(R^a\hat\beta_n - r^a)^T \left[R^a \left(\tfrac{X^TX}{n}\right)^{-1} \hat{V}_n^a \left(\tfrac{X^TX}{n}\right)^{-1} R^{aT}\right]^{-1} (R^a\hat\beta_n - r^a) \overset{A}{\sim} \chi^2_q$$

vi) under the hypothesis $H_0: R^b\Gamma_0 = r^b$,
$$n(R^b\hat\alpha_n - r^b)^T \left[R^b \left(\tfrac{Z^TZ}{n}\right)^{-1} \hat{V}_n^b \left(\tfrac{Z^TZ}{n}\right)^{-1} R^{bT}\right]^{-1} (R^b\hat\alpha_n - r^b) \overset{A}{\sim} \chi^2_q$$

Before stating this theorem it is once again necessary to introduce additional notation.
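As a numerical aside before that notation is introduced, the quantities appearing in Theorem 5.5 can be illustrated with a short simulation. The following is only a sketch under stated assumptions, not the procedure used in this thesis: the function names (`ols`, `white_wald`, `simulate`), the simulated design, the choice $Z_i = X_i$, and the parameter values are all hypothetical and chosen purely for illustration. Because the data are simulated, the true errors $\varepsilon_i$ are available, so the otherwise noncomputable estimator $\hat\alpha_n = (Z^TZ)^{-1}Z^T\varepsilon^2$ can be evaluated directly.

```python
import numpy as np


def ols(X, y):
    """Least squares coefficients (X'X)^{-1} X'y (assumes X'X is nonsingular)."""
    return np.linalg.solve(X.T @ X, X.T @ y)


def white_wald(X, y, R, r):
    """Heteroskedasticity-robust Wald statistic in the form of Theorem 5.5 v).

    Returns the OLS estimate, the sandwich matrix
    (X'X/n)^{-1} V (X'X/n)^{-1} with V = n^{-1} sum_i e_i^2 x_i' x_i,
    and n (R b - r)' [R * sandwich * R']^{-1} (R b - r).
    """
    n = X.shape[0]
    b = ols(X, y)
    e = y - X @ b                               # OLS residuals
    A_inv = np.linalg.inv(X.T @ X / n)
    V = (X * (e ** 2)[:, None]).T @ X / n       # n^{-1} sum e_i^2 x_i' x_i
    sandwich = A_inv @ V @ A_inv
    d = R @ b - r
    stat = n * d @ np.linalg.solve(R @ sandwich @ R.T, d)
    return b, sandwich, stat


def simulate(n=20000, seed=0):
    """Hypothetical linear-variance design: Var(eps_i) = Z_i @ gamma0 with Z = X."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(1.0, 3.0, size=n)
    X = np.column_stack([np.ones(n), x])
    Z = X                                       # illustrative choice only
    beta0 = np.array([1.0, 2.0])
    gamma0 = np.array([0.5, 1.0])
    eps = rng.normal(size=n) * np.sqrt(Z @ gamma0)
    y = X @ beta0 + eps
    return X, Z, y, eps, beta0, gamma0


X, Z, y, eps, beta0, gamma0 = simulate()
R = np.array([[0.0, 1.0]])                      # tests the slope only
r = np.array([2.0])                             # H0 holds in this simulation
beta_hat, sandwich, wald = white_wald(X, y, R, r)
alpha_hat = ols(Z, eps ** 2)                    # uses the true errors
```

In this design the null hypothesis is true by construction, so `wald` behaves like a draw from a chi-square distribution with one degree of freedom, while `beta_hat` and `alpha_hat` should lie close to `beta0` and `gamma0` for a sample of this size.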
S2 th be the nxl column vector whose i entry is n Let En Letn 2 =(ZTZ~ ZT 0 Let N.in Let 9cn ' 2 =Ci in - E. 2 in . if ZTZ is nonsingular otherwise Z3.n and n~1 Z n 2Z. T Z. i=1in i i THEO.RFM 5.6. Under assumption Al, A2, and A3, the following hold: i) rn r a.s. T n ii) iii) vri [( n )-l vc(Z n n -1J- 2 n~Io) At N(OIm) M F-o under the hypothesis H0 : R' 0 =r (where R,r are as in theorem 1 ZZ-c (Rrn r) T [R(-)n.nRn n Vn n 2 ZZ-l -TR T-l 1- (Rr nrr) n n-r ^V Xq (Note: A part of the statement of this theorem is whose inverses must be taken 4 almost everywhere). Proof. 1) as n + oe- that matrices nonsingular See Chapter IV. The idea of the proof is quite simple. We show that the difference is norm between the statistics stated in Theorem 2 and the associated (noncomputable) statistics in Theorem 1 converges to zero as n + w almost everywhere. Hence by 169 proposition 3 and theorem 1, the desired results hold. Unfortunately, the only proof I know involves a large amount of computation. 0l 2) 11 F r | rn n||n, it suffices to show 3) + TA^2 0 a.e. ZT(eFn2_E2 -1 Zy ^^ r n-a n 1 xn-rO 11 and so by theorem I, + rn < n 2 n 11 (ZTZ)1 11 T n-an ZT ^ 2_ 2 Z n E ) (n n By SLLN (Prop. 1)11 Z By A2 for n > N0.' - nl E(ZTZ.)I| 0 a.e. n i -1l1 [n1 E.=1 E(Z Z )] -_ * It follows from standard Banach Algebra techniques that if l TZ)| E(Z )-n2 ZTZ -1 (1n ) 2 <T. ZT ^ 2_ ZT 2 n n1 ZT n 1 n2 Z TX0 Zn n-1lZn n Xi ~ n1-1IEn il Therefore, to show i) it is enough to show 0 a.e. -+ 2) is invertible and < = + E- Xi2] ZT A 2- )2 [(Y Z n n 2 + A~ [Xi"(%-n)Xi(%-n) + 2s 0 Z [X (6 n0 0 n) + X.02 Tm m i r= 0ir nrX is0 n s ) c 1 ir 0 n r Z (E:-E2 ) is a mxl vector and m is a fixed finite number; thus to prove convergence to zero, it suffices to prove component wise convergence to zero. Furthermore, since the sums indexed by r and s are of a fixed finite length, we are Now n reduced to showing. 4) n- (E 1ZirX Xis) 0 nr( n +2 Zir(iir 0 n r 170 (1 < k 1 r,s < m). converge to zero a.e. 
This is an immediate consequence of A2, proposition 5) ylim fl If li I XY CnDnII AnB n = 0, it suffices to show: - Cn a) plim IAn = DnII=O b) plim IBn c) JI, then to prove X!Y > for n such that wand N(s) < m N (e) < M(C)} prob {IIAn < M(E) there exists for any s > 0, > 1-c and prob {JJ Bn 1 < M(e)} > 1- s This is true since AnBn < IIAnD l IIBn - DnI1 + An-Cn11 < 11An I1 Bn 6) - DnI = above, to show plim V'F (n AnDn + AnDn - CnDn Bn1 + JB n1 - n 1 Z ,.n,Z-l nn AnBn n 1 + IAn - Cn 1(I1 Dn ( na n) /n CnDni - an ZT ( n 2_2 as w e saw in 3) 0 we are reduced to showing for 1 < k,r,s < m a)plim o- Rn r) Zir XirXis ) No~ n) s n b) plim (n 1 1 (So VM (Va n o Zikir b) M~a) n n o) (M- on ns nr i= ioki n r ) ) Va 0o= 1 o N(0,I). -1 .<a- u Assumption A2,I.a. ensures us that Cnis uniformly bounded. 0 171 for any c < 0, n > N(E) implies probability { iIvl(6 n ) n 0-- there is a M(c) < o and an N(c) - Since (n-60)-+O E a.e., n < M(e) } > 1-E. - 0 a.e. and since -ZikXir+ there exists M < o such that (n fl T -1c Vcn _ 8) n n[ n n n b 1 (Z_ -1 b-n~ zn =1 n 0. = TT TC n a.e. n n n n n n Ij< M n.E il ZikXi X. 6a and 6b hold, and plim V/i (Fn-a) co), < o such that in 2 in 2 )ZTZ. 3. It is necessary to show that plim 1|^ n _ n n | =0 Unfortunately this requires the following long computation. 9-AC is a mxm matrix and m is nn show for 1 < j,Z < m plim n Z zi Z Since 2 win - v 2Win ~in2) ='0 2 ^ 2 ^ ^ I - Z an + Z (an ~rn ^ Zin a fixed finite number + in 2 + [ei 2 n {[Z i (an Z n - z Zi-1zjziz n rn ) i.n 22 I) C in 2 i 2 A Vin 9) 2 in' Ci n ) ^ n Z z 33, in 2 in ^ (c.--Z. a1ni.)Z 2) 2 ^ (a - d )+2 (E ii -_Zian)(E. n. 1-E.)+2Z. a -i( ia i n n. 2 2 . - i:. } in It is now convenient to proceed term by term 10) n -l m n Z Z1 m r=l s=l(n ^ Z ^ 2 [z(an)J 2n= m _1 zi ijZiZ'ZirZi-))(ann (s -nnr 172 Since - a -inn 0 a.e. -n||l E Z Z [Z (an (n rn)] 2 - 0. + it o), -+ follows that (n a.e. + o) (11) i. 1 =i in i2= S Xin= ian - 2+ 2 2s. 2 + n X.6+s. i i+ X - - X.Sn ian ==*iX.(G-Un)+s. 
na- d+ + N- 2 n 2 + 4 = - [X.e n ]4 kr=i=14(ZZ k k 1 nr+ XXins n n i ir s n S Z ijZitXirXisXitXi) k *k k k r=1 s= 1 t=1 v=l(n nt a-an +)- 0 a.e. (n assumption A2, n 1 + o), Z n ns n r so again by 22iE [ Z i=1 ij ii[Ein 0 a.e. (12) n2 n i Z .Zi 2(c.- n m 22(n1 Z. i1 r=1 2Z m m r=1 s=l(n- ^ Zcan)ZC(an 2 Z Zi i ijiZir 2Z - n - rn n(cn)r n r+ + an- n zi=iZijZiZirzis)ans (an~ n r 173 Now a so = F Os + (a nl n-POs) a an o< I r0 + 1 a.e. (n Thus, n nls -1 n z Z. Z i=1 ij o). 2E iz - 1 Z1 a f ii nn n) +a 0 a.e. (13) n2 n =1 n - I j Z Z 2( n i1 Z..Z. k n r=1 (4n k sE - Z ifn 2 I n 2 ) (2Ei X (- Z Z Xir 3 )($-n )r n) + [X.(-n 2 + 2 e Z I ZXirXis n r + n s n Z. s: Z.Z. Z. X. )a~ ($-p + 1= i ij iZ ir is nr n s 4( r=1 s=1 n 2 2(C k k r=1 s=1 2(n m 2 - - ( m k k r=1 s=l t=1 n =2( Z Z Zir is it iz So we conclude nE z -Z 2(c n) nr i n t n s C ) +0 a.e. (14) -1 n Z .Z 2Z n I 2 i 2 (an rn in -1 nA n~ i1 Z Z.2Z.(an -r) (2e.X.-$n) m k r=1 s=1 4(n m k ) nn Z e k Zir is n n r + n s n r=1 s=1 t=1 2(ns nZ + [X (-$n) so n and thus 1 2( Z Z Z ~ ZirXisXit) (an~ 2 -T (.-EQ 0 (s r )($ nr (n a.e. (n n n) - 174 15) 16) 0 | - D - +7 0 |IV = 0cn hence a.e. (n n- EZ - nI + 0 -*- c). Z )-| E1( a.e. (n -+ 1 For n sufficiently large, (ViY)nn < ($ ) Therefore a.e. implies the existence of T 17) z nB iT nZ - | - ) (Z a.e. (n +) ) . exists and - | (V'II<1 ) which of course w) , (V ) a. e. (n CO) . - By assumption A2.c) there exists M < w such that 11Mb all n, hence I 18) (u :S By Proposition 1, i 0 (n 0 )j a.e. as n < M+1 j <M -. Furthermore, by assumption, for n sufficiently large, ( ) -1 a4 . T T YZb (Z ZZ) 19) We conclude then that 2 2 1 n < (M+l) a.e. (n -+- cc) and that 20) 21) ) [( nn By 20) [( n ) Z b -l Z 1 -c( vn n n )- 1 n oZ~l--L, a. e. and Proposition 4). nb n ) -1 [( n -v c n n ZP) -1--]|| 0 a.e. 
175 22) It follows from White < n Mn (a -r there exists M(E) <.m, such that for n > N(E) prob {'a 24) --b ~ that [ J, N(0,I ) and hence for any c > 0, N(s) -b- llan~r o M(E)} >l-E Recalling 5) we see that pli 1 b Z [ -(r [rn )- nn -~ (Z -0- n n where we need only note|| v/ (no r. o) n r nn Thus part ii follows. R( n )- T T T R 2 1 (Zc _bl (ZZ) -ylTn n n b n c n -l2 n hence R Z c Z-lRT - n n n ) n RT b Z n n (n + o) 28) Let P be a positive mxm matrix and jP- II < S Now R is a qxm matrix of full row rank; hence there is a 6 > 0 inf 11RTX I > S. Xe X11 ||=1 <RPRTX,X> = <PRTX'RTX> = hence RPRT is invertible and R TX 112 11(RPRT-l _ 6 P 6 176 S 29 - zT V n(-n -) Z n -l Vc Z n n T where i is a positive operator, such that ( - -l < ) M+1 (M+l) 2 2 a. e. a.e. (n -+ ). (n -+ co) Hence we conclude [R( n ) 1 Z c n - lRT l - n [R( n ) Yb n ) -lRT n -lD+0 a.e. 30) |v| (Rn-r) -- (Rrn-r) < l R plim IVrn(R n-r) - V Under H0 ;V(Ra n-r) ( vln-(an r) T- (RPn n)O (R n-r) n plim 11v(R nr)T -n- 31) II = i Rr n-r)TI| = 0 VW(R& n-R 0) = Rv(Q -^ro); thus under H0 , for c > 0, there is a M(e) < N(c) < o such that n > N(c) prob {fn(Ra n and implies * Observing that our norm structures are such that X II| < || XII | Yl|| 11ABC-DEFH| 1A -DII and hence that <| Al] 11B|| 1| CF | + | Al FH IIB-El Ell 1 F| , we have just shown that: + 177 32) Under H0 , | T (R n-r) V r- (Rrn-)(RT and so part iii [R ( T ) z z Z ( R -l T -(RTn T ^CT -1 T -1 n- r)T= follows. [X] Thus we have shown that we have a valid asymptotic test for testing linear restrictions ance model. additional Our next and last assumptions, on the parameters result is in the vari- to show that under we can use our estimates from the vari- ance model to reestimate the original model and obtain an estimator that is asymptotically equivalent to weighted least squares with variances There exists A4) known. X > 0 such that for all i a3 > X. A5) For all i, E( 1Wis, ...W p) = 0,E( i2 wil, ... 
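$W_{ip}) = Z_i\Gamma_0$.

A6) There exists $M < \infty$ such that for all $i$, $\|Z_i\| < M$.

THEOREM 5.7. Let $\Omega_n$ denote the $n \times n$ diagonal matrix whose $(i,i)$ entry is $\sigma_i^2$ and let $\hat\Omega_n$ denote the $n \times n$ diagonal matrix whose $(i,i)$ entry is $\hat\sigma_{in}^2 = Z_i\hat\Gamma_n$. Let $\tilde{B}_n$ denote the Aitken estimator given by $\tilde{B}_n = (X^T\Omega_n^{-1}X)^{-1}X^T\Omega_n^{-1}Y$ where $X^T\Omega_n^{-1}X$ is nonsingular, and $\tilde{B}_n = 0$ if $X^T\Omega_n^{-1}X$ is singular. Let $\hat{B}_n$ be the weighted least squares estimator given by $\hat{B}_n = (X^T\hat\Omega_n^{-1}X)^{-1}X^T\hat\Omega_n^{-1}Y$ if $X^T\hat\Omega_n^{-1}X$ is nonsingular, and $\hat{B}_n = 0$ if $X^T\hat\Omega_n^{-1}X$ is singular. Under assumptions A1-A6:

i) $\hat{B}_n \to \beta_0$ a.e. and $\operatorname{plim} \sqrt{n}\,(\hat{B}_n - \tilde{B}_n) = 0$.

ii) $(n^{-1}X^T\hat\Omega_n^{-1}X)^{1/2} \sqrt{n}\,(\hat{B}_n - \beta_0) \overset{A}{\sim} N(0, I_k)$.

iii) If $R$ is a $q \times k$ matrix of real numbers with full row rank and $r$ is a $q \times 1$ vector of real numbers, then under $H_0: R\beta_0 = r$,
$$n(R\hat{B}_n - r)^T \left[R(n^{-1}X^T\hat\Omega_n^{-1}X)^{-1}R^T\right]^{-1} (R\hat{B}_n - r) \overset{A}{\sim} \chi^2_q.$$

Proof:
1) Consider the transformed model
$$\frac{Y_i}{\sigma_i} = \frac{X_i}{\sigma_i}\beta_0 + \frac{\varepsilon_i}{\sigma_i}, \quad i = 1, \ldots, n.$$
So
a) $E\left(\frac{\varepsilon_i}{\sigma_i}\right) = \int \frac{1}{\sqrt{Z_i\Gamma_0}}\, E(\varepsilon_i \mid W_{i1}, \ldots, W_{ip})\, dP = 0$
b) $E\left(\frac{\varepsilon_i^2}{\sigma_i^2}\right) = \int \frac{1}{Z_i\Gamma_0}\, E(\varepsilon_i^2 \mid W_{i1}, \ldots, W_{ip})\, dP = \int \frac{Z_i\Gamma_0}{Z_i\Gamma_0}\, dP = 1$
c) $E\left(\frac{X_{ij}\varepsilon_i}{\sigma_i^2}\right) = \int \frac{X_{ij}}{Z_i\Gamma_0}\, E(\varepsilon_i \mid W_{i1}, \ldots, W_{ip})\, dP = 0$, $1 \le j \le k$.

2) The OLS estimator on the transformed model is $\tilde{B}_n$. Assumptions A4 and A6 ensure that $0 < \lambda < \sigma_i^2 \le M\|\Gamma_0\| < \infty$ for all $i$, so all the assumptions of White's theorem are satisfied and we conclude:
a) $\tilde{B}_n \to \beta_0$ a.e.
b) $\sqrt{n}\,(n^{-1}X^T\Omega_n^{-1}X)^{1/2}(\tilde{B}_n - \beta_0) \overset{A}{\sim} N(0, I_k)$
c) under the hypothesis $H_0: R\beta_0 = r$, $n(R\tilde{B}_n - r)^T \left[R(n^{-1}X^T\Omega_n^{-1}X)^{-1}R^T\right]^{-1}(R\tilde{B}_n - r) \overset{A}{\sim} \chi^2_q$.

3) In the case that $X^T\Omega_n^{-1}X$ and $X^T\hat\Omega_n^{-1}X$ are both nonsingular,
$$\hat{B}_n - \tilde{B}_n = (X^T\hat\Omega_n^{-1}X)^{-1}X^T\hat\Omega_n^{-1}(X\beta_0 + \varepsilon) - (X^T\Omega_n^{-1}X)^{-1}X^T\Omega_n^{-1}(X\beta_0 + \varepsilon)$$
$$= (n^{-1}X^T\hat\Omega_n^{-1}X)^{-1}\, n^{-1}X^T\hat\Omega_n^{-1}\varepsilon - (n^{-1}X^T\Omega_n^{-1}X)^{-1}\, n^{-1}X^T\Omega_n^{-1}\varepsilon.$$

4) $n^{-1}X^T\hat\Omega_n^{-1}X = n^{-1}\sum_{i=1}^{n} \frac{X_i^TX_i}{\hat\sigma_{in}^2}$, where $\hat\sigma_{in}^2 = Z_i\hat\Gamma_n$.

5) $n^{-1}X^T\hat\Omega_n^{-1}X - n^{-1}X^T\Omega_n^{-1}X = n^{-1}\sum_{i=1}^{n} X_i^TX_i\left(\frac{1}{\hat\sigma_{in}^2} - \frac{1}{\sigma_i^2}\right)$ is a $k \times k$ matrix where $k$ is a fixed finite number. To show that $\|n^{-1}X^T\hat\Omega_n^{-1}X - n^{-1}X^T\Omega_n^{-1}X\| \to 0$ a.e., it suffices to show convergence for each matrix entry.

6) $\left| n^{-1}\sum_{i=1}^{n} X_{ij}X_{il}\left(\frac{1}{\hat\sigma_{in}^2} - \frac{1}{\sigma_i^2}\right) \right| \le \max_{1 \le i \le n}\left|\frac{\sigma_i^2}{\hat\sigma_{in}^2} - 1\right| \cdot n^{-1}\sum_{i=1}^{n} \frac{|X_{ij}X_{il}|}{\sigma_i^2}$.

7) $\left|\frac{\sigma_i^2}{\hat\sigma_{in}^2} - 1\right| = \frac{|Z_i(\Gamma_0 - \hat\Gamma_n)|}{|Z_i\hat\Gamma_n|}$, and $|Z_i\hat\Gamma_n| \ge |Z_i\Gamma_0| - |Z_i(\Gamma_0 - \hat\Gamma_n)| \ge \lambda - M\|\Gamma_0 - \hat\Gamma_n\|$. Since $\hat\Gamma_n \to \Gamma_0$ a.e., it follows that
$$\sup_{1 \le i \le n}\left|\frac{\sigma_i^2}{\hat\sigma_{in}^2} - 1\right| \le \frac{M\|\Gamma_0 - \hat\Gamma_n\|}{\lambda - M\|\Gamma_0 - \hat\Gamma_n\|} \to 0 \quad \text{a.e. } (n \to \infty).$$
Therefore, by Proposition 2 and assumption A2, for $1 \le j, l \le k$,
$$n^{-1}\sum_{i=1}^{n} X_{ij}X_{il}\left(\frac{1}{\hat\sigma_{in}^2} - \frac{1}{\sigma_i^2}\right) \to 0 \quad \text{a.e. } (n \to \infty),$$
and hence $n^{-1}(X^T\hat\Omega_n^{-1}X) - n^{-1}(X^T\Omega_n^{-1}X) \to 0$ a.e. $(n \to \infty)$.

8) Let $Z \in \mathbb{R}^k$. Then
$$\langle n^{-1}(X^T\Omega_n^{-1}X)Z, Z\rangle = n^{-1}\langle \Omega_n^{-1}XZ, XZ\rangle \ge \frac{1}{M\|\Gamma_0\|}\, n^{-1}\langle XZ, XZ\rangle = \frac{1}{M\|\Gamma_0\|}\langle n^{-1}X^TXZ, Z\rangle \ge \frac{\|Z\|^2}{M\|\Gamma_0\|\,\|(n^{-1}X^TX)^{-1}\|}$$
(providing these inverses exist). For $n$ sufficiently large, $\|[E(n^{-1}X^TX)]^{-1}\|$ is bounded by assumption A2, so $\langle n^{-1}(X^T\Omega_n^{-1}X)Z, Z\rangle$ is bounded below by a positive multiple of $\|Z\|^2$ a.e. $(n \to \infty)$. Thus $\|[n^{-1}(X^T\Omega_n^{-1}X)]^{-1}\|$ is bounded a.e. $(n \to \infty)$, and
$$\left\|[n^{-1}(X^T\hat\Omega_n^{-1}X)]^{-1} - [n^{-1}(X^T\Omega_n^{-1}X)]^{-1}\right\| \to 0 \quad \text{a.e. } (n \to \infty).$$

9) $n^{-1}[X^T(\hat\Omega_n^{-1} - \Omega_n^{-1})\varepsilon]$ is a $k \times 1$ vector, so again we need only show coordinate convergence to zero:
$$\left| n^{-1}\sum_{i=1}^{n} X_{ij}\varepsilon_i\left(\frac{1}{\hat\sigma_{in}^2} - \frac{1}{\sigma_i^2}\right) \right| \le \max_{1 \le i \le n}\left|\frac{\sigma_i^2}{\hat\sigma_{in}^2} - 1\right| \cdot n^{-1}\sum_{i=1}^{n} \frac{|X_{ij}\varepsilon_i|}{\sigma_i^2}.$$
Therefore we conclude $\|n^{-1}X^T\hat\Omega_n^{-1}\varepsilon - n^{-1}X^T\Omega_n^{-1}\varepsilon\| \to 0$ a.e. $(n \to \infty)$, and there exists $C < \infty$ such that $\|n^{-1}X^T\Omega_n^{-1}\varepsilon\| \le C$ a.e. $(n \to \infty)$.

10) Hence $\|\hat{B}_n - \tilde{B}_n\| \to 0$ a.e. $(n \to \infty)$ and $\hat{B}_n \to \beta_0$ a.e. $(n \to \infty)$ (part i). {Recall $\|UV - XY\| \le \|U\|\,\|V - Y\| + \|U - X\|(\|V\| + \|Y - V\|)$.}

11) $\sqrt{n}\,[n^{-1}X^T(\hat\Omega_n^{-1} - \Omega_n^{-1})\varepsilon]$ is a $k \times 1$ vector; to show that $\operatorname{plim} \sqrt{n}\,[n^{-1}X^T(\hat\Omega_n^{-1} - \Omega_n^{-1})\varepsilon] = 0$, it suffices to show that this holds by coordinate:
$$\sqrt{n}\, n^{-1}\sum_{i=1}^{n} X_{ij}\varepsilon_i\left(\frac{1}{\hat\sigma_{in}^2} - \frac{1}{\sigma_i^2}\right) = \sqrt{n}\, n^{-1}\sum_{i=1}^{n} X_{ij}\varepsilon_i\, \frac{\sigma_i^2 - \hat\sigma_{in}^2}{\sigma_i^2\hat\sigma_{in}^2} = \sum_{r=1}^{m}\left(n^{-1}\sum_{i=1}^{n} \frac{X_{ij}\varepsilon_i Z_{ir}}{\sigma_i^2\hat\sigma_{in}^2}\right)\sqrt{n}\,(\Gamma_0 - \hat\Gamma_n)_r.$$

12) We have already seen in the proof of Theorem 2 that for any $\epsilon > 0$ there exist $M(\epsilon) < \infty$ and $N(\epsilon) < \infty$ such that $n > N(\epsilon)$ implies $\operatorname{prob}\{\sqrt{n}\,\|\hat\Gamma_n - \Gamma_0\| \le M(\epsilon)\} > 1 - \epsilon$. We now must show that
$$n^{-1}\sum_{i=1}^{n} \frac{\varepsilon_i X_{ij} Z_{ir}}{\sigma_i^2\hat\sigma_{in}^2} \to 0 \quad \text{a.e.}, \quad 1 \le j \le k,\ 1 \le r \le m.$$

13) $E\left(\left|\frac{\varepsilon_i X_{ij} Z_{ir}}{\sigma_i^4}\right|^{1+\delta}\right) \le \lambda^{-2(1+\delta)}\Delta$ (where $\delta$, $\Delta$ are as in assumption A2). Therefore, by Proposition 1,
$$n^{-1}\sum_{i=1}^{n}\left[\frac{\varepsilon_i X_{ij} Z_{ir}}{\sigma_i^4} - E\left(\frac{\varepsilon_i X_{ij} Z_{ir}}{\sigma_i^4}\right)\right] \to 0 \quad \text{a.e. } (n \to \infty),$$
and hence, since each expectation vanishes by A5, $n^{-1}\sum_{i=1}^{n} \frac{\varepsilon_i X_{ij} Z_{ir}}{\sigma_i^4} \to 0$ a.e. $(n \to \infty)$.

14) $\left| n^{-1}\sum_{i=1}^{n} \frac{\varepsilon_i X_{ij} Z_{ir}}{\sigma_i^2\hat\sigma_{in}^2} \right| \le \left| n^{-1}\sum_{i=1}^{n} \frac{\varepsilon_i X_{ij} Z_{ir}}{\sigma_i^4} \right| + \max_{1 \le i \le n}\left|\frac{\sigma_i^2}{\hat\sigma_{in}^2} - 1\right| \cdot n^{-1}\sum_{i=1}^{n} \frac{|\varepsilon_i X_{ij} Z_{ir}|}{\sigma_i^4} \to 0$ a.e. $(n \to \infty)$. So $\operatorname{plim} \sqrt{n}\,[n^{-1}X^T(\hat\Omega_n^{-1} - \Omega_n^{-1})\varepsilon] = 0$, and, combining this with 3) and 8), $\operatorname{plim} \|\sqrt{n}\,(\hat{B}_n - \tilde{B}_n)\| = 0$.

15) $\left\|\sqrt{n}\,(n^{-1}X^T\hat\Omega_n^{-1}X)^{1/2}(\hat{B}_n - \beta_0) - \sqrt{n}\,(n^{-1}X^T\Omega_n^{-1}X)^{1/2}(\tilde{B}_n - \beta_0)\right\|$
$$\le \left\|(n^{-1}X^T\hat\Omega_n^{-1}X)^{1/2}\right\|\,\left\|\sqrt{n}\,(\hat{B}_n - \tilde{B}_n)\right\| + \left\|(n^{-1}X^T\hat\Omega_n^{-1}X)^{1/2} - (n^{-1}X^T\Omega_n^{-1}X)^{1/2}\right\|\,\left\|\sqrt{n}\,(\tilde{B}_n - \beta_0)\right\|;$$
therefore $\operatorname{plim} \left\|\sqrt{n}\,(n^{-1}X^T\hat\Omega_n^{-1}X)^{1/2}(\hat{B}_n - \beta_0) - \sqrt{n}\,(n^{-1}X^T\Omega_n^{-1}X)^{1/2}(\tilde{B}_n - \beta_0)\right\| = 0$ and part ii follows.

16) $\|\sqrt{n}\,(R\hat{B}_n - r) - \sqrt{n}\,(R\tilde{B}_n - r)\| \le \|R\|\,\|\sqrt{n}\,(\hat{B}_n - \tilde{B}_n)\|$, so $\operatorname{plim} \|\sqrt{n}\,(R\hat{B}_n - r) - \sqrt{n}\,(R\tilde{B}_n - r)\| = 0$. Under the hypothesis $H_0: R\beta_0 = r$, for any $\epsilon > 0$ there exist $M(\epsilon) < \infty$ and $N(\epsilon) < \infty$ such that $n > N(\epsilon)$ implies $\operatorname{prob}\{\|\sqrt{n}\,(R\tilde{B}_n - r)\| \le M(\epsilon)\} > 1 - \epsilon$.

17) $\left\|R(n^{-1}X^T\hat\Omega_n^{-1}X)^{-1}R^T - R(n^{-1}X^T\Omega_n^{-1}X)^{-1}R^T\right\| \le \|R\|^2\,\left\|(n^{-1}X^T\hat\Omega_n^{-1}X)^{-1} - (n^{-1}X^T\Omega_n^{-1}X)^{-1}\right\| \to 0$ a.e. $(n \to \infty)$ (from 8). Therefore $[R(n^{-1}X^T\hat\Omega_n^{-1}X)^{-1}R^T] - [R(n^{-1}X^T\Omega_n^{-1}X)^{-1}R^T] \to 0$ a.e. $(n \to \infty)$.

18) $R^T$ is a $k \times q$ matrix of full column rank; therefore there exists $\delta > 0$ such that $\|R^TZ\| \ge \delta\|Z\|$ for all $Z \in \mathbb{R}^q$. Then
$$\langle R(n^{-1}X^T\Omega_n^{-1}X)^{-1}R^TZ, Z\rangle = \langle (n^{-1}X^T\Omega_n^{-1}X)^{-1}R^TZ, R^TZ\rangle \ge \frac{\delta^2\|Z\|^2}{\|n^{-1}X^T\Omega_n^{-1}X\|}.$$
The $(s,t)$ component of $n^{-1}X^T\Omega_n^{-1}X$ is $n^{-1}\sum_{i=1}^{n} \frac{X_{is}X_{it}}{\sigma_i^2}$, so there exists $C < \infty$ such that $\|n^{-1}X^T\Omega_n^{-1}X\| \le C$ a.e. $(n \to \infty)$. Therefore $\left\|[R(n^{-1}X^T\Omega_n^{-1}X)^{-1}R^T]^{-1}\right\| \le C/\delta^2$ a.e. $(n \to \infty)$ and
$$\left\|[R(n^{-1}X^T\hat\Omega_n^{-1}X)^{-1}R^T]^{-1} - [R(n^{-1}X^T\Omega_n^{-1}X)^{-1}R^T]^{-1}\right\| \to 0 \quad \text{a.e. } (n \to \infty).$$

19) We can therefore conclude that under $H_0$,
$$\operatorname{plim} \left| n(R\hat{B}_n - r)^T[R(n^{-1}X^T\hat\Omega_n^{-1}X)^{-1}R^T]^{-1}(R\hat{B}_n - r) - n(R\tilde{B}_n - r)^T[R(n^{-1}X^T\Omega_n^{-1}X)^{-1}R^T]^{-1}(R\tilde{B}_n - r) \right| = 0,$$
and so by Proposition 3, part iii holds. [X]

CHAPTER V

SUMMARY AND CONCLUSIONS

The state of the nation's housing programs, particularly those for the urban poor, should convince even the skeptic that we do not have a firm understanding of the housing market. This market is unique among economic markets. Patterns of residential development have a profound effect upon the social and economic development of the family, the municipality, and the nation as a whole.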
Urban economics and planning professions must seek to develop new theoretical and empirical procedures to help us analyze the housing market. In the preceding chapters, I have presented new theoretical procedures that should be of interest to the urban economist and policy developer. meant to be nor is it a finished tome. It is not Rather it points to new directions for future work. In this last chapter, I'll briefly discuss directions that future research might take. The search model presented in the first essay is not completely analyzed. It would be of great interest to build into it a mechanism for replacing 186 187 buyers and sellers and then examining the implications over tine for an individual buyer and seller resulting from market aggregation. Ultimately one desires to have a search model where bid structure, sellers behavior, and search strategy are all endogenous. One would then be able to examine the time paths of the buyers and sellers behavior as well as that of the market as a whole. It would be advan- tageous to be able to determine the effects on this model from altering the distribution of incomes of buyers and from adding more buyers and sellers from different income groups without having to replicate each agent. If we had this type of model, we would be in a much better position for understanding effects of discrimination on market prices of housing as well as in neighborhood residential patterns. While White's work and that in the third essay may seem to answer the questions of estimation and hypothesis testing in the presence of heteroskedasticity, this is far from the fact. The results that we have derived, and those in the second essay also, yield asymptotic properties. What is clearly needed are small sample properties of the various estimates and statistics that are designed to deal with 188 heteroskedasticity. As computers become more power- ful and more available, maximum likelihood procedures become more accessible. 
It would be of great value to know the circumstances under which each of these estimators dominates in the small (finite) sample case.

It is clear that urban policies and programs have little chance of success until their designers gain a better grasp of the behavior of the urban housing market. While much theoretical and empirical research remains to be done, there have been large gains in developing the theoretical and technical tools necessary for effective analysis of the housing market. However, the gains made in the development of these theories and techniques will be insubstantial unless the policy and program planners develop technical skills and mathematical maturity sufficient to understand and utilize the theories.

BIBLIOGRAPHY

1. Alonso, W. Location and Land Use. Harvard University Press, Cambridge, Mass., 1970.
2. Arrow, Kenneth, and F. H. Hahn, General Competitive Analysis, Holden Day, Inc., San Francisco, 1971.
3. Ash, Robert B. Real Analysis and Probability. Academic Press, New York, N.Y., 1972.
4. Box, G.E.P., "Some Theorems on Quadratic Forms Applied in the Study of Analysis of Variance," Annals of Mathematical Statistics, 1954.
5. Brown, A., and Pearcy, C. Introduction to Operator Theory, I: Elements of Functional Analysis. New York, Springer-Verlag, 1977.
6. ______. Introduction to Operator Theory II. (In preparation).
7. Butters, G. "Equilibrium Distribution of Sales and Advertising Prices." Review of Economic Studies, October 1977, vol. 44, 465-491.
8. Chow, Y.S., Robbins, H., and Siegmund, D. Great Expectations: The Theory of Optimal Stopping. Houghton Mifflin Co., Boston, Mass., 1971.
9. Chung, Kai Lai. A Course in Probability Theory. Academic Press, New York, N.Y., 1974.
10. Diamond, P. "Wage Determination in Search Equilibrium," Working Paper 253, Department of Economics, M.I.T., Cambridge, Mass., 1980.
11. Diamond, P.A. and Stiglitz. "Increases in Risk and Risk Aversion." Journal of Economic Theory, 1974, vol. 8, 337-360.
12.
Downs, George W., and Rocke, David M., "Interpreting Heteroskedasticity," American Journal of Political Science, vol. 23, no. 4, November 1979, pp. 816-828.
13. Feller, William. An Introduction to Probability Theory and Its Applications. Volumes I and II, John Wiley and Sons, New York, N.Y., 1957.
14. Glejser, H. "A New Test for Heteroskedasticity." Journal of the American Statistical Association, 64 (1969), 316-323.
15. Goldfeld, S.M., and Quandt, R.E. "Some Tests for Homoskedasticity." Journal of the American Statistical Association, 60 (1965), 539-559.
16. Halmos, P.R. A Hilbert Space Problem Book. New York, Van Nostrand, 1967.
17. Ito, K. and Schull, W.J. "On the Robustness of the T² Test in Multivariate Analysis of Variance When Variance-Covariance Matrices are not Equal." Biometrika, vol. 51, June 1964.
18. Kmenta, Jan. Elements of Econometrics. 1971, Macmillan Publishing Co., Inc., New York, N.Y.
19. Kohn, Meir G. and Shavell, Steven. "The Theory of Search." Journal of Economic Theory, 1974, vol. 9, 93-123.
20. Kosters, Marvin and Welch, Finis. "The Effects of Minimum Wages by Race, Age, and Sex." in Anthony H. Pascal, ed. Racial Discrimination in Economic Life, D.C. Heath, Lexington, Mass., 1972.
21. Lippman, S.A., and McCall, J.J. "The Economics of Job Search: A Survey, Part I: Optimal Job Search Policies." Economic Inquiry, June 1976, 155-189.
22. Lippman, S.A., and McCall, J.J. "The Economics of Job Search: A Survey, Part II: Empirical and Policy Implications of Job Search." Economic Inquiry, September 1976, pp. 347-368.
23. Lucas, R.E. and Prescott. "Equilibrium Search and Unemployment." Journal of Economic Theory, 1974, vol. 7, 188-209.
24. Malinvaud, E., Statistical Methods of Econometrics, 2nd Edition, English translation by Mrs. A. Silvey, 1970, North-Holland Publishing Co., Amsterdam.
25. McCall, John J. Income Mobility, Racial Discrimination and Economic Growth. D. C. Heath and Co., Lexington, Mass., 1973.
26.
Mieszkowski, Peter, and Straszheim, Mahlon, Current Issues in Urban Economics, 1979, Johns
27. Mills, Edwin S., Studies in the Structure of the Urban Economy, Resources for the Future, Inc., Johns Hopkins Press, Baltimore, MD., 1972.
28. Mirman, L.J. and Porter, W.R. "A Microeconomic Model of the Labor Market Under Uncertainty." Economic Inquiry, June 1974, vol. 12, 135-145.
29. Montgomery, Roger, and Marshall, Dale Rogers, Housing Policy for the 1980's, Lexington Books, D.C. Heath and Company, Lexington, Mass., 1980.
30. Mortensen, Dale T. "Job Search, the Duration of Unemployment and the Phillips Curve." American Economic Review, December 1970, 847-862.
31. Muth, Richard, Cities and Housing, U. of Chicago Press, Chicago, IL., 1969.
32. Nourse, Hugh O., The Effect of Public Housing on Housing Markets, Lexington Books, D.C. Heath and Company, Lexington, MA., 1973.
33. Oberhofer, W. and Kmenta, J., "A General Procedure for Obtaining Maximum Likelihood Estimates in Generalized Regression Models." Econometrica, vol. 42, no. 3, May 1974.
34. Park, R.E. "Estimation with Heteroskedastic Error Terms." Econometrica, vol. 34, no. 4 (1966), p. 888.
35. Polinsky, A. Mitchell, and Ellwood, David T., "An Empirical Reconciliation of Micro and Grouped Estimates of the Demand for Housing," The Review of Economics and Statistics, LXI, #2, May, 1979.
36. Prais, S.J., and Houthakker, H.S. The Analysis of Family Budgets, The University Press, 1955, Cambridge, England.
37. Rao, C.R. Linear Statistical Inference and Its Applications. New York, John Wiley and Sons, 1973.
38. Reid, Margaret, Housing and Income, University of Chicago Press, Chicago, IL., 1962.
39. Rothschild, M. and Stiglitz, J. "Increasing Risk: I. A Definition." Journal of Economic Theory, 1970, pp. 225-243.
40. Scheffe, H. The Analysis of Variance, 1959, John Wiley, New York, N.Y.
41.
Smith, Barton, and Campbell, J.M., Jr., "Aggregation Bias and the Demand for Housing," International Economic Review, 19, June 1978, 495-505.
42. Smith, Wallace F., Housing, University of California Press, Berkeley, CA., 1971.
43. Solomon, Arthur P., Housing the Urban Poor, A publication of the Joint Center for Urban Studies of the Massachusetts Institute of Technology and Harvard University, 1974.
44. Stigler, G.J. "The Economics of Information." Journal of Political Economy, June 1961, 213-225.
45. ______. "Information in the Labor Market." Journal of Political Economy, October 1962, 94-104.
46. Straszheim, Mahlon R., An Econometric Analysis of the Urban Housing Market. National Bureau of Economic Research, Columbia University Press, New York, N.Y., 1975.
47. Telser, Lester G. "Searching for the Lowest Price." American Economic Review, May 1973, 40-49.
48. Theil, Henri. Principles of Econometrics, 1971, John Wiley and Sons, Inc., New York, N.Y.
49. Tiebout, Charles M., "A Pure Theory of Local Expenditures," Journal of Political Economy, 64, October, 1956, pp. 416-424.
50. U.S. Bureau of the Census, Statistical Abstract of the United States: 1979, 100th edition, Washington, D.C., 1979.
51. United States Commission on Civil Rights, Twenty Years After Brown: Equal Opportunities in Housing, U.S. Commission on Civil Rights, December, 1975.
52. Varian, Hal R. "A Model of Sales." The American Economic Review, September 1980, vol. 70, #4.
53. Weicher, John C. Housing, American Enterprise Institute for Public Policy Research, Washington, D.C., 1980.
54. Welch, B.L. "The Significance of the Difference Between Two Means when the Population Variances are Unequal." Biometrika, 1937.
55. Wendt, Paul F. "The Role of the Federal Government in Housing," No. 460 in the series "National Economic Problems," American Enterprise Association, Inc., Washington, D.C., 1956.
56. Wheaton, W.C. "A Bid Rent Approach to Housing Demand." Journal of Urban Economics, 1977.
57. ______. "Income and Urban Residence: An Analysis of Consumer Demand for Location." The American Economic Review, vol. 67, no. 4, September 1977, pp. 620-631.
58. ______, Milgram, Grace, and Meyerson, Margy Ellin, Urban Housing, The Free Press, MacMillan Company, New York, N.Y., 1966.
59. White, Halbert. "A Heteroskedasticity Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity." Econometrica, vol. 48, 1980, pp. 817-838.
60. Wohlstetter, Albert and Coleman, Sinclair. "Race Difference in Income." in Anthony Pascal, ed. Racial Discrimination in Economic Life, D.C. Heath, Lexington, Mass., 1972.
61. Yinger, John. "A Search Model of Real Estate Broker Behavior." The American Economic Review, September 1981, pp. 591-605.
62. Anas, Alex, "A Model of Residential Change and Neighborhood Tipping," JUE 7, 1980, pp. 358-370.
63. ______, "A Probabilistic Approach to the Structure of Rental Housing Markets," JUE 7, 1980, pp. 225-247.
64. Bailey, M.J., "Effects of Race and Other Demographic Factors on the Values of Single Family Homes," Land Economics 42, 1966, pp. 215-220.
65. Burstein, Nancy, "Voluntary Income Clustering and the Demand Housing and Local Public Goods," JUE 7, 1980, pp. 175-185.
66. Carliner, Geoffrey, "Income Elasticity of Housing Demand," RES 4, 1973, pp. 528-532.
67. Chinloy, P., "The Estimation of Net Depreciation Rates on Housing," JUE 6, 1979, pp. 432-443.
68. Courant, P.N. and J. Yinger, "On Models of Racial Prejudice and Urban Residential Structure," JUE 4, 1977, pp. 272-291.
69. Dusansky, R., M. Ingber, and N. Karatjas, "The Impact of Property Taxation on Housing Value and Rents," JUE 10, 1981, pp. 240-255.
70. Duesenberry, J.S. and H. Kistin, "The Role of Demand in the Economic Structure," Studies in the American Economy, edited by W. Leontief, Oxford U. Press, N.Y., 1953, pp. 451-482.
71. Erekson, O. H., and A. D. Witte, "The Demand for Housing: Comment," Southern Economic Journal, 46, 1979, pp. 641-648.
72. Fare, R. and J. Y.
Bong, "Variable Elasticity of Substitution in Urban Housing Production," JUE 10, 1981, pp. 369-374.
73. Farley, R., S. Bianchi, and D. Colasanto, "Barriers to the Racial Integration of Neighborhood: The Detroit Case." The Annals of the American Academy of Political and Social Science 44, 1979, pp. 97-113.
74. Follain, J. R. and S. Malpezzi, "Another Look at Racial Differences in Housing Prices," Urban Studies 18, 1981, pp. 195-203.
75. Grieson, R. and J. White, "The Effects of Zoning on Structure and Land Markets," JUE 10, 1981, pp. 271-285.
76. Herbert, J. H., "Model Estimation and Verification--Some Recent Approaches." Empirical Economics 4, 1979, pp. 87-99.
77. Hirsch, W. Z., and Cheung-Kwok Law, "Habitability Laws and the Shrinkage of Substandard Rental Housing Stock." Urban Studies 76, 1979, pp. 19-28.
78. Kain, J. F., and J. M. Quigley, "Housing Markets and Racial Discrimination," National Bureau of Economic Research, New York, 1975.
79. Kaluzny, R. L., "Changes in the Consumption of Housing Services: The Gary Experiment," The Journal of Human Resources, 14, 1979, pp. 496-506.
80. Kau, J. B. and Sirmans, C. F., "Urban Land Value Functions and the Price Elasticity of Demand for Housing," JUE 6, 1979, pp. 112-121.
81. Kern, Clifford R., "Racial Prejudices and Residential Segregation: The Yinger Model Revisited." JUE 10, 1981, pp. 164-172.
82. King, A. T., "The Demand for Housing: A Lancastrian Approach," Southern Economic Journal 43, 1976, pp. 1077-1087.
83. Kraft, J., and A. Kraft, "Benefit and Cost of Low Rent Public Housing," Journal of Regional Science, 19, 1979, pp. 309-317.
84. Lee, T. H., "Housing and Permanent Income: Tests Based on a Three Year Reinterview Survey," RES 50, 1968, pp. 480-490.
85. deLeeuw, Frank, "The Demand for Housing: A Review of Cross-Section Evidence," Econometrics 53, 1971, pp. 1-7.
86. MacMinn, R., "Search and Market Equilibrium," Journal of Political Economy, 88(2), 1980, pp. 308-327.
87. MacRae, C. D., and M. A.
Turner, "Estimating Demand for Owner Occupied Housing Subject to the Income Tax," JUE 10, 1981, pp. 338-356.
88. Maisel, S. J., Burnham, and John Austin, "The Demand for Housing: A Comment," RES 53, 1971, pp. 410-413.
89. Mieszkowski, P., and R. Syron, "Economic Explanations for Housing Segregation," New England Economic Review, Federal Reserve Bank, Boston, Mass., Nov/Dec 1979, pp. 33-39.
90. Mills, D., "Segregation, Rationing and Zoning," Southern Economic Journal 45(4), 1979, pp. 1195-1207.
91. Murray, M., "Tenant Benefits in Alternative Federal Housing Programmes," Urban Studies 17, 1980, pp. 25-34.
92. Myers, S. J. and K. E. Phillips, "Housing Segregation and Black Employment: Another Look at the Ghetto Dispersal Strategy," AER 69, 1979, pp. 298-302.
93. Phillips, R. S. "A Note on the Determinants of Residential Succession," JUE 9, 1981, pp. 49-55.
94. Polinsky, M., "The Demand for Housing: A Study in Specification and Grouping." Econometrica 45, 1977, pp. 447-461.
95. Power, T., "Urban Size (Dis)amenities Revisited," JUE 9, 1981, pp. 85-89.
96. Reschovsky, A., "Residential Choice and the Local Public Sector: An Alternative Test of the Tiebout Hypothesis," JUE 6, 1979, pp. 501-520.
97. Rose-Ackerman, S. "Racism and Urban Structure," JUE 2, 1975, pp. 85-103.
98. Rosen, H. S., "Estimation Inter-City Differences in the Price of Housing Services." Urban Studies, 15, 1978, pp. 351-355.
99. Rydell, C. P., "Supply Response to the Housing Allowance Program." International Regional Science Review 5, 1980, pp. 119-138.
100. Schelling, T. C., "Neighborhood Tipping," in Racial Discrimination in Economic Life, Heath, Lexington, Mass., 1972.
101. Sirmans, C. F., J. Kau, and C. F. Lee, "The Elasticity of Substitution in Urban Housing Production: A VES Approach." JUE 6, 1979, pp. 407-415.
102. Stegman, Michael A. and H. J. Sumka, "Income Elasticities of Demand for Rental Housing in Small Cities," Urban Studies 15, 1978, pp. 51-61.
103.
Stevens, Barbara, "Employment, Permanent Income, and the Demand for Housing," JUE 6, 1979, pp. 480-500.
104. Vickerman, R. W., "The Evaluation of Urban Change: Equilibrium and Adaptive Approaches," Urban Studies 16, 1979, pp. 81-93.
105. Walden, M. L. "A Note on Benefit and Cost Estimates in Public Assisted Housing," Journal of Regional Science, 21, 1981, pp. 421-423.
106. Westhoff, F., "Policy Inference in Community Choice Models: A Caution," JUE 6, 1979, pp. 535-549.
107. Winger, A. R., "Housing and Income," Western Economic Journal 6, 1968, pp. 226-232.
108. Yinger, J., "Racial Preference and Residential Segregation in an Urban Model," JUE 3, 1976, pp. 383-396.
109. Yinger, John and Sheldon Danziger, "An Equilibrium Model of Urban Population and the Distribution of Income." Urban Studies 15, 1978, pp. 201-214.