ON STATISTICAL TESTING OF HYPOTHESES IN ECONOMIC THEORY

By Trygve Haavelmo

Introductory note by Olav Bjerkholt: This article by Trygve Haavelmo was originally given as a lecture at a Scandinavian meeting for younger economists in Copenhagen in May 1939. Three weeks later Haavelmo left for the USA and did not return to his home country Norway until nearly eight years later. The lecture was given in Norwegian and Haavelmo never published it. It was eventually published in 2008 (Samfunnsøkonomen 62(6), pp. 5-15). It has been translated by Professor Erik Biørn at the University of Oslo and is now published in English for the first time. The lecture is quite important from a history-of-econometrics perspective. Much attention has been given to how, when and under influence by whom Trygve Haavelmo developed his probability approach to the fundamental problems of econometrics. The lecture provides crucial evidence of how far Haavelmo had got in his thinking by the time he left for the USA in June 1939. In the lecture, Haavelmo touches upon several concepts that have later become extremely important in econometrics – e.g. identifiability, autonomy, and omitted variables – without using these terms. Words or expressions underlined by Haavelmo have been rendered in italics, and quotation marks have been retained. Valuable comments on the translation of the article have been given by John Aldrich, Duo Qin, and Yngve Willassen. Student of economics Frikk Nesje has provided excellent technical assistance with the graphs.

1. INTRODUCTION

In economic theory we attempt to formulate laws for the interaction between events in economic life. They may be purely qualitative statements, but most of them, and by far the most important laws, are of a quantitative nature; indeed, what we are most frequently concerned with are quantitative or quantifiable entities. This emphasis on quantitative reasoning is seen in almost any work of theory, regardless of whether the formulation is purely verbal or is given in more precise mathematical terms. The derivation of such laws rests on a foundation of hypotheses. We proceed from certain basic hypotheses, and maybe introduce supplementary hypotheses along the way, while proceeding through a chain of conclusions. The value of the results – provided that their derivation is technically impeccable – then depends on the foundation of hypotheses. Indeed, each conclusion itself becomes merely a new hypothesis, a logical transformation of the original assumptions. For this reason I will here use hypotheses as a common term for the statements in economic theory.

Anyone familiar with economic theory knows how it is often possible to formulate several, entirely different "correct" theories about one and the same phenomenon. This is due to differences in the choice of assumptions. One often encounters crossroads in the argument, where one direction a priori appears just as plausible as another. To avoid it all becoming a logical game, one must at each stage keep the following questions in mind: Is my argument rooted in reality, or am I operating within a one hundred percent model world? Is what I have found essential or merely insubstantial? Here, the requirement of statistical verification can aid us, preventing our imagination from running riot, and forcing us to a sharp and precise formulation of the hypotheses. This statistical scrutiny saves us from many empty theories, while at the same time giving the hypotheses that are verified by data immensely greater theoretical and practical value.
It may seem that we would be correct in sticking to what we see from the data only. But that is not so. Then we would never be able to distinguish between essential and inessential features. Data may give us ideas of how to formulate hypotheses, but theoretical considerations must also be drawn upon. On the other hand, we should not uncritically reject a hypothesis even if a data set seems to point in another direction. Many hypotheses, maybe the most fundamental and fruitful ones, are often not so apparent that they can be tested by data. But we can take the argument further until we reach the "surface" hypotheses which are testable. Then, if we are repeatedly in conflict with our data – and in essential respects – we shall have to revise our hypotheses. But perhaps the data we have used are not appropriate, or we have been unable to "clean" them of elements which are not part of our hypotheses. In the analysis of these various possibilities lies the crucial problem of statistical hypothesis testing. There are specific testing problems associated with every hypothesis, but there are also certain problems of a more general nature, and they can be divided into groups. It is these more general problems I will try to comment upon in the following sections.

2. THE HYPOTHESES IN ECONOMIC THEORY ARE OF A STATISTICAL NATURE

Strictly speaking, exact laws belong in logical thought constructions only. When the laws are carried over to the real world, we must always allow room for inexplicable discrepancies, the exact laws changing into relations of a statistical nature. This holds true for any science, the natural sciences not excepted. In principle, economic science is thus not special in this respect, even if, so far, there is an enormous difference of degree relative to the "exact" sciences. The theoretical laws we operate with say something about the effects of certain imagined variations in a more or less simplified model world. For example: how will changes in price and purchasing power affect the demand for a particular good; what is the relationship between output and input in a production process; or what is the connection between changes in interest rates and changes in the price level, and so on? As a special case, our hypothesis may state that certain entities are constants, but such conclusions also rely on certain imagined variations.

If we now take our model world into the observation field, innumerable new elements come into play. The imagined variations are replaced by the variations which have actually taken place in the real data. Our models in economic theory are often so simple that we do not expect to find any agreement. Such models are by no means necessarily without interest. On the contrary, they may represent a very valuable survey of what would happen under certain assumptions, so that we know what would be the outcome in a situation where the assumptions were in fact satisfied. Other hypotheses may be closer to reality, by attempting to include as many realistic elements as possible. But we shall never find any exact agreement with statistical data. Neither is this what we are asking for. What concerns us is whether certain relations can be established as statistical average laws. We may say that such laws connecting a certain set of specified entities are exact in a statistical sense if, when the number of observations becomes very large, they approach in the limit a certain form which is virtually independent of elements not included in our model.
That such statistical laws are what we usually have in mind in economic theory, is confirmed by the fact that we almost invariably deal with variables that have a certain "weight". For instance, we do not ask for the demand responses of specific persons to price changes, but rather seek average responses for a larger group, or – equivalently – the responses of the typical representatives of certain groups ("the man in the street"). We study average prices and average quantities or total quantities for larger groups of objects, etc. The idea is the same as in statistics, namely that the detailed differences disappear in the mass, while the typical features accumulate. But the cases where the "errors" vanish completely are only of theoretical interest; for practical purposes it is much more important that they almost disappear when considering large masses. When this is the case, it does not make much difference whether we, for reasons of convenience, operate with exact relations instead of relations with errors, e.g., by drawing the relation between price and quantity in a demand diagram as a curve rather than as a more or less wide band (Figure 1).

[Figure 1]

But the circuit of issues related to hypothesis testing is not exhausted by the question of a smaller or larger degree of precision in the conformity between data and a given hypothesis. The crucial problems in the testing of hypotheses precede this stage of the analysis. It turns out – as we shall see – that many hypotheses by no means lend themselves to verification by data, even if they are quantitatively well defined and realistic enough. Indeed, we may be led astray if we attempt a direct verification. Moreover, almost all hypotheses will be connected with substantial "ceteris paribus"-clauses which pose particular statistical problems. In addition comes the question of the choice of quantitative form of the hypothesis under consideration (the specification problem), and here also we must usually be assisted by data. Before addressing these various problems it is, however, convenient to take a look at the general principles of statistical hypothesis testing.

3. ON THE GENERAL PRINCIPLE OF STATISTICAL TESTING OF HYPOTHESES

Let us consider two observation series, r and x, for example real income (r) and the consumption of pork and other meat (x) in a number of working-class families over a certain time span during which prices have been constant (Figure 2).

[Figure 2]

We advance the following hypothesis

(3.1) x = k·r + b (k and b constants).

Now, we might clearly draw an arbitrary straight line in the (x, r)-diagram and consider the observations that do not fall on this line as affected by "errors". For our question to have a meaning, we must therefore formulate certain criteria for accepting or rejecting the hypothesis. Of course, the choice of such criteria is not uniquely determined. Supplementary information about the data at hand, with attention paid to the intended use of the result, etc., will have to be considered. Obviously, one set of criteria can lead us to reject, another to accept, the same hypothesis. To illustrate the kind of purely statistical-theoretical problems one may encounter, we will go through the reasoning for one particular criterion. Let us assume that k and b shall satisfy the condition that the sum of squares of the deviations from the line x = k·r + b, taken in the x-direction, is as small as possible.
The crucial issue is, presumably, whether the k determined in this way is significantly positive (that is, whether consumption increases with income). To proceed further, we need to make supplementary assumptions about the nature of our observation material. Let us, for example, consider the given observation set as a random sample from a two-dimensional normal distribution with marginal expectations and standard deviations equal to those observed. Perhaps we have additional information that makes such an assumption plausible. With this specification, the testing problem is reduced to examining whether the observed positive correlation coefficient, and hence k, is significantly positive. In order to examine this, we try the following alternative hypothesis: the observation set is a random sample from a two-dimensional normal distribution with marginal expectations and standard deviations equal to those observed, but with correlation coefficient equal to zero. If this alternative hypothesis is accepted, then our initial hypothesis is thereby rejected. On the other hand, if the alternative hypothesis must be rejected, then all hypotheses that the correlation coefficient in the two-dimensional normal distribution is negative must a fortiori be rejected, i.e., the initial hypothesis must be accepted (under the assumptions made). But now I can give a quite precise probability statement about the validity of this alternative hypothesis, since from this hypothesis I am able to calculate the probability of getting by chance, in a sample of N observations, a correlation coefficient at least as large as the one observed. If this probability is for example 0.05, then I know that, on average, in 5 of 100 cases like the actual one, I commit an error by rejecting the alternative hypothesis, that is, by accepting the observed coefficient as significant. Once I have specified how certain I want to be, the decision is thus completely determined. If the observed correlation coefficient passes this test, I can, for example, substitute its value into the two-dimensional normal distribution and compute a probabilistic expression for the observed distribution being a random sample from the theoretical distribution thus determined. In this way, I also test the validity of the assumed two-dimensional normal distribution.

One sees that by this kind of testing we are exposed to two types of errors:
1. I may reject the hypothesis when it is correct.
2. I may accept the hypothesis when it is wrong, i.e., when another hypothesis is correct.

The first type of error is one I may commit when dismissing a hypothesis that does not seem very likely, but still might be correct. The second type of error occurs when I accept a particular one among the possible hypotheses that "survive" the testing process, since one of the others may well be the correct one. What I achieve by performing hypothesis tests like these is to delimit a field of possible hypotheses. Yet, probabilistic considerations applied to economic data may often be of dubious value, so that we may here choose other criteria. But the argument still runs along similar lines. It is therefore convenient to take the purely statistical hypothesis testing technique as our point of departure.
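The procedure just described can be illustrated with a small numerical sketch. It is not part of the original lecture: the data are invented, and the probability statement is made operational through the familiar t-statistic for a correlation coefficient, which is one concrete way of computing the probability of obtaining by chance, under the zero-correlation hypothesis, a correlation at least as large as the one observed.

```python
# Minimal sketch (invented data, not from the lecture): fit x = k*r + b by
# least squares and test whether k, equivalently the correlation between r
# and x, is significantly positive.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
N = 40
r = rng.uniform(100, 300, N)                  # real income (hypothetical units)
x = 0.05 * r + 2.0 + rng.normal(0, 1.5, N)    # meat consumption with "errors"

k, b = np.polyfit(r, x, 1)                    # least-squares k and b

# One-sided test of the alternative hypothesis of zero correlation:
# t = corr * sqrt(N - 2) / sqrt(1 - corr^2), referred to Student's t.
corr = np.corrcoef(r, x)[0, 1]
t = corr * np.sqrt(N - 2) / np.sqrt(1 - corr**2)
p_value = 1 - stats.t.cdf(t, df=N - 2)

print(f"k = {k:.4f}, b = {b:.2f}, corr = {corr:.3f}, one-sided p = {p_value:.4f}")
# If p_value is below the chosen level (say 0.05), the zero-correlation
# hypothesis is rejected and k is accepted as significantly positive.
```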
4. THE FREE AND THE SYSTEM-BOUND VARIATION. "VISIBLE" AND "INVISIBLE" HYPOTHESES

Many hypotheses, including those we perhaps reckon as basic in economic theory, seem to be strongly at odds with the statistical facts. This phenomenon often provides those with a particular interest in criticizing economic theory with welcome "statistical counterevidence". But there is not necessarily anything paradoxical in such occurrences. Rather, it may be that such seeming contradictions just serve to verify the theoretical hypotheses. We will now examine this phenomenon a bit closer. It relates to matters which are absolutely necessary to keep in mind when attempting statistical verification.

We see this problem most clearly when reflecting on how the construction of the hypotheses proceeds: First, we define a specific set of objects – certain economic variables – to be studied. At the start, they move freely in our model world. Then we begin studying the effects of certain imagined variations, and are in this way led towards certain relations which the studied variables should satisfy. Each such relation restricts the freedom of variation in the group of entities we study. If we have n variables and m independent equations (n > m), then only n − m degrees of freedom remain. A person who now takes a glance into our model world will not detect the free variations on which the formulation of each separate conditioning equation rested; he will see solely the system-bound variation which follows when all conditioning equations are required to be fulfilled simultaneously.

In demand and supply theory we find simple examples of this. Let us take a market where incomes are constant and assume that demand and supply (x) are functions only of the price (p) of the commodity considered, that is:

(4.1) x = f(p) (the demand curve),
(4.2) x = g(p) (the supply curve).

(See Figure 3.) If this holds exactly, the only information we get from data for this market will be the point A in Figure 3. This alone cannot give us any information about the shape of the two curves. They are "invisible" hypotheses in this material. On the other hand, if our hypotheses are realistic, then the observed result follows by necessity. In this "pure" case, we run no risk of being misled by data. In practice we should, however, be prepared to find that the demand and the supply relations are not two curves, but rather two bands, as indicated in Figure 4. And then data may lead us astray. Indeed, the data then become the arbitrarily located observation points in the area (a, b, c, d) in Figure 4. If the widths of the two error bands are approximately equal, then we definitely do not know whether what we observe is supply variations or demand variations. If the demand band is narrower than the supply band, we get most knowledge about the form of the demand "function". Conversely, if the supply band is the narrower, we obtain most knowledge about the supply "function".

[Figure 3]
[Figure 4]

Now, let us bring variations in income (r) into the demand function, still letting the supply depend on the price only, that is

(4.3) x = f(p, r) (demand),
(4.4) x = g(p) (supply).

Assume, for simplicity, that (4.3) and (4.4) are two linear equations, i.e., two planes in the (x, p, r)-diagram (see Figure 5). If this holds exactly, then all variation in this market must take place along the line of intersection between the two planes. This line of intersection is the confluent market relation emerging from the structural relations (4.3) and (4.4) holding simultaneously. It is, of course, impossible to determine the slope of the demand plane from statistical data for this market, as there are innumerable planes (and, for that matter, also curved surfaces) which give the same line of intersection.
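A small simulation can make the confluence problem concrete. The sketch below is not from the lecture and all coefficients are invented: a linear demand plane and a linear supply line are allowed to shift by random "errors", and the regression of x on p and r then delivers whichever mixture of the two structural relations the error structure dictates, rather than the demand plane itself. With strictly exact relations the computation would not even be well defined, since p and r would then be perfectly collinear.

```python
# Invented coefficients, not from the lecture: demand x = -2*p + 0.5*r + 10
# and supply x = 1.5*p + 5 hold simultaneously, each shifted by a random
# "error".  The regression of x on p and r does not recover the demand plane;
# its coefficients are governed by the relative size of the two errors.
import numpy as np

def market_regression(sd_demand, sd_supply, N=5000, seed=2):
    rng = np.random.default_rng(seed)
    r = rng.uniform(80, 120, N)
    u_d = rng.normal(0, sd_demand, N)     # shift of the demand plane
    u_s = rng.normal(0, sd_supply, N)     # shift of the supply line
    p = (0.5 * r + 5 + u_d - u_s) / 3.5   # market-clearing price
    x = 1.5 * p + 5 + u_s                 # market-clearing quantity
    A = np.column_stack([p, r, np.ones(N)])
    coef, *_ = np.linalg.lstsq(A, x, rcond=None)
    return coef[:2]                       # fitted coefficients on p and r

print(market_regression(sd_demand=1.0, sd_supply=0.01))  # ~[ 1.5, 0.0]: the supply line
print(market_regression(sd_demand=0.01, sd_supply=1.0))  # ~[-2.0, 0.5]: the demand plane
```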
The only visible traces of our hypotheses are three straight lines in, respectively, the (x, p)-, the (x, r)-, and the (p, r)-diagrams. In our example, the straight line in the (x, p)-diagram is, of course, nothing other than the supply curve (see Figure 6). We know this here because we know the two structural relations (4.3) and (4.4). But if we only had the data to rely on, we might have been led to interpret the observed relation in the (x, p)-diagram as an upward-sloping demand curve. If, on the contrary, we had formulated the hypotheses (4.3) and (4.4) a priori, then the observed relation would just have been a verification. Just as in the previous example, it will be most realistic to consider the demand and the supply functions not as exact planes or surfaces, but rather as two surfaces of a certain thickness, in this way allowing for the random variations. The confluent market relation then becomes a "box" extending outwards in the (x, p, r)-diagram, and we get statistical problems similar to those mentioned in connection with Figure 4.

[Figure 5]
[Figure 6]

This problem of confluence emerges as part of practically all testing tasks. It also occurs in other fields of research, but it occupies a prominent position in economic testing problems because here we are in general – disregarding the possibilities provided by interview investigations – precluded from performing experiments with free variations. Thus, the system-bound observed variation is the only information at our disposal. This is indeed one of the main reasons why refined statistical techniques must be given such a strong emphasis in modern economic research. Using solely the "sledge hammers" among statistical methods will not do; we need the most refined tools among our statistical techniques to come to grips with the problems. Above we have seen how we can run into difficulties when trying out hypotheses, without this being due to restrictive assumptions or lack of realism. But the testing problems, of course, get still more complicated when one is confronted with hypotheses conditioned by substantial "ceteris paribus"-clauses. We shall now take a look at the key issues raised by this.

5. THE "CETERIS PARIBUS" CLAUSE AS A STATISTICAL PROBLEM

For "ceteris paribus" statements to lead to something more than mere trivialities, we should first and foremost make clear to ourselves which other elements are assumed unchanged. We have no justification for making a "ceteris paribus" statement on matters that we know nothing about at all. The rational application of the "ceteris paribus" clause enters our hypotheses in two forms. The first is that, when formulating a hypothesis as realistic as possible, we start out by trying to specify the elements essential to the problem at hand. Data and practical prior knowledge may guide us in this process. But we are forced to restrict our selection, and we may then impose the "ceteris paribus"-clause on all remaining unspecified elements, because the total effect of these other elements has by experience played no large part in the problem at hand and cannot be expected to do so in the future, so that whether we assume them unchanged or let them vary freely is virtually of no consequence.
The second form of the "ceteris paribus"-clause is the one we impose within the system of specified variables. The idea here is just the same as that of partial derivatives. We study relations between some of the specified objects, which are mutually independent, subject to the assumption that the remaining specified elements are kept constant. Usually, the form that such relations take depends on the level at which the other elements are fixed. Such reasoning is not only of theoretical interest; on the contrary, it is also the basis for assessing the effects of practical measures of intervention in economic activity. Statistical data do not in general satisfy – fortunately, we should say – such "ceteris paribus"-clauses. Thereby we indeed get a tool by means of which we can study the effects of variations in all the entities which are essential to the problem at hand. Once we know these effects, we can possibly eliminate them. The requirement of statistical testability is really the most important – not to say the only – means by which to clarify the nature of our "ceteris paribus" clauses.

Among the statistical techniques, regression analysis here comes to the foreground. It occupies, moreover, a key position in all modern econometric research. By means of it, the general principle for an extensive group of test problems can be formulated like this: We attempt to establish regression equations which include, in addition to the variables entering our hypotheses, as many as possible other "irrelevant" variables whose variations systematically influence what is to be "explained", such that the residual variations become as small as possible and non-systematic. If this is within reach, then the "ceteris paribus" problem is reduced to that of studying the effect of partial variations in the regression equation we have established. But not many attempts are needed to become convinced that this is far from a beaten track.

First, we are again confronted with the issue of confluent versus structural relations, exemplified earlier. For example, if we have, apart from random errors, a relation like the one illustrated in Figure 5, it would be completely nonsensical to attempt to determine the slope of the demand plane by regression, including income as a variable in addition to price. That would give a completely arbitrary result, entirely determined by the errors in the material. Modern regression analysis has, however, measures to safeguard against such fictitious results. We are able to see the confluent relations which the data satisfy. As mentioned above, this is actually a very important way of testing our hypotheses. If they are realistic, they should give just the observed confluent relation as a result of elimination. Only in rare situations are we able to formulate our hypotheses a priori such that this will be the case. The statistical regression analysis thus becomes an important means for adjusting the hypothesis formulation.

The above is, of course, merely a couple of rather superficial remarks to indicate the place of regression analysis as a statistical-technical tool in hypothesis testing. If we should, even only casually, touch upon the more technical details, we would right away have to erect a considerable construction of definitions and formulae, for which there is no space here.

Secondly, we are confronted with the specific issue of the significance of the statistical results.
Often the situation here is as follows: Our hypothesis may, for example, be a relation which the data ought to satisfy. But in a given set of observations the actually realized variations in some variables may have been so small that the errors dominate. From such data we then cannot see what would have taken place in another data set in which the variations had been pronounced. In such cases we get nowhere in our attempt to illuminate the effect of our "ceteris paribus"-clauses. If we assume that the variables which have not had significant variations in our data set continue to be of this kind, then we can say that it is inessential to include these variables in our hypothesis to explain what happens. But what is of greatest interest is often precisely to find what would happen if one made interventions in the system.

I can give an illustration of these problems, taken from a study of the demand for pork in Copenhagen, undertaken at the Institute of Economics at Aarhus University this year. (The results conveyed here are only preliminary, and we mention only one of the trials.) One would expect beforehand that many factors influence the consumption of pork: the price of pork, the price of other meat relative to that of pork, the income of the purchasers, the cost of living, and the size and composition of the population would all be among the factors one might a priori take into account. All these elements were included in a regression analysis. The cost-of-living level was used to transform prices and incomes into real values; the size and composition of the population were used to calculate consumption per consumer unit. These transformed variables were used as the explanatory variables. The relationship was assumed linear; nothing in the data suggested a more complicated relationship. (Attempts with logarithmic transformations gave, besides, virtually the same result.) It then turned out that the direct covariation between the consumption (per consumer unit) and the real pork price was the totally dominating feature of our data set. Without elimination of any other factors this relation showed a correlation of about 0.90 – which gave a gross elasticity with respect to the pork price of about –0.8. Inclusion of the other factors had virtually no influence on this correlation. Attempting to explain the residual variation by incorporating these other factors worked out as theoretically expected, but their explanatory power was weak. In particular, this was the case for the effect of real income changes. It is impossible to accept this as a general result. If consumers' purchasing power declined to, say, one half, it would unavoidably exert a decisive effect on the consumption of a commodity like pork. The circumstances which explain the above outcome are partly that the variations in real income have been small, and partly that they have had a confluent covariation with the pork price (correlation ca. –0.70). If this latter relationship had been very tight, we might equally well have taken income as the explanatory variable instead of the real pork price. However, the income variation alone gave a much less significant explanation of the changes in consumption (correlation only about 0.5). Theoretically as well as practically it would have been of great interest to be able to illustrate statistically the effect of price variations under constant incomes. But in this case, the data were hardly sufficient to assess the implication of such a "ceteris paribus"-clause.
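The difficulty can be mimicked with invented numbers; the following sketch is not based on the Copenhagen material. Consumption is generated from both the real price and real income, but income is given little independent variation and a strong confluent covariation with the price, so that its regression coefficient comes out imprecise even though the variable matters structurally.

```python
# Invented illustration, not the Copenhagen data: a structurally important
# variable whose realized variation is small and confluent with the price
# gets an imprecise regression coefficient.
import numpy as np

rng = np.random.default_rng(3)
N = 60
price = rng.normal(100, 10, N)                               # real pork price
income = 200 - 0.3 * (price - 100) + rng.normal(0, 1.0, N)   # varies little, corr with price ~ -0.95
consumption = 50 - 0.4 * price + 0.10 * income + rng.normal(0, 2.0, N)

A = np.column_stack([price, income, np.ones(N)])
coef, *_ = np.linalg.lstsq(A, consumption, rcond=None)
resid = consumption - A @ coef
sigma2 = resid @ resid / (N - 3)
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))

for name, c, s in zip(("price", "income"), coef[:2], se[:2]):
    print(f"{name:7s} coefficient {c:7.3f}   standard error {s:.3f}")
# The price coefficient is sharply determined; the income coefficient (truly
# 0.10) typically has a standard error larger than the estimate itself and
# may even come out with the wrong sign.
```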
In treating the testing problems we have thus far tacitly skipped another major problem that usually occurs jointly with them, namely

6. THE SPECIFICATION PROBLEM

This is the problem of choosing the quantitative form for the hypotheses of economic theory. In more general theoretical formulations one often, for the time being, leaves this question open. For instance, one often indicates only that a variable is some function of some other variables, or, a bit more specifically, that a certain variable increases with a partial increase in another variable. But such general statements often provide only half-way solutions to the problems presented. It is often at least as important to establish exactly how a change works and how large its magnitude is. This is not a question to be asked afterwards, at the stage of applying the theoretical law. In fact, the numerical values of the parameters quite often matter for the kind of theoretical conclusions one can draw as well. We see this clearly when we, for example, study the solution forms of a determinate dynamic system. Changes in the numerical values of the coefficients may well change the nature of the solution, e.g., switch it from cyclical movements to trend movements. The final answer is not obtained until a statistical analysis is performed. But this requires that the hypothesis be given a precise quantitative form. Here we should also consult the data, but data alone cannot provide a unique solution to the specification problem. In fact, it has no meaning at all to ask: which hypothesis is the best one? Data cannot decide this. But we can formulate a set of alternative hypotheses and choose certain testing criteria to distinguish between them, e.g., decide whether a parabola gives a smaller residual sum of squares than a straight line. What is important is that we, by using data, are able to eliminate a sizable number of "unreasonable" hypotheses. We indicated general principles for doing this in section 3 above.

The choice between different possible hypotheses may sometimes be reduced to a question of mathematical approximation. The final choice of specification may then appear to be of lesser importance. But often the choice may also have far-reaching consequences for the conclusions drawn. Suppose, for example, that the issue is to choose between the following two forms of a demand curve

(6.1) x = a·p + b (a and b constants),
(6.2) log(x) = e·log(p) + c (e and c constants).

In (6.1) the demand elasticity with respect to the price is a function of x and p:

(6.3) ε(x, p) = a·(p/x) = a·p/(a·p + b).

In (6.2) the demand elasticity is constant, equal to e. Assume that both hypotheses give practically the same correlation in a given data set. This may well happen. (See, for example, Henry Schultz: Der Sinn der statistischen Nachfragekurven, Bonn 1930, pp. 57 and 69.) If we insert in (6.3) the full sample means of x and p, we usually get practically the same value of the elasticity as the constant e in (6.2). This is distinctly different from inserting into (6.3) the individual observations of p and the x-values calculated from (6.1). (Of course, one should not take the observed x-values.) Then we get a more or less strongly varying elasticity. (See for example Wolff: The Demand for Passenger Cars in U.S.A., Econometrica, April 1938, pp. 123–124.) From a theoretical-economic viewpoint the difference between these two results is essential.
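A small numerical sketch, with invented coefficients, of the contrast between the two specifications: under the linear form (6.1) the elasticity (6.3) varies over the observed price range, whereas under (6.2) it is the constant e; inserted at the sample means, the two will typically be of comparable size.

```python
# Invented coefficients: elasticity implied by the linear form (6.1) versus
# the constant elasticity of the logarithmic form (6.2).
import numpy as np

a, b = -2.0, 40.0                  # hypothetical coefficients of x = a*p + b
p = np.linspace(5.0, 15.0, 5)      # a range of observed prices
x = a * p + b                      # x-values calculated from (6.1)
elasticity = a * p / x             # equation (6.3): a*p / (a*p + b)

for pi, ei in zip(p, elasticity):
    print(f"p = {pi:5.1f}   elasticity = {ei:6.2f}")
# The elasticity runs from about -0.33 at p = 5 to -3.0 at p = 15; near the
# mean price it is of the same order as the constant e a log-linear fit gives.
```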
We here need supplementary theoretical considerations to decide which hypothesis to choose. This arbitrariness is, however, narrowed down by the fact that these various hypotheses should fit into a large number of interconnections with other economic factors. This criss-cross testing makes it possible to eliminate a priori a lot of quite impossible hypotheses. Here as in other cases we clearly see how theoretical formulation of hypotheses and statistical testing are not two successive steps, but a simultaneous process in the analysis of economic problems. This is the basic idea of modern econometric research.

7. THE TREND PROBLEM

We now proceed to take a look at the problem of trend elimination. This is often perceived as a purely technical-statistical issue, but its nature is really much more profound. We will attempt to examine briefly the logical foundation for trend elimination. In our theoretical formulation of laws, we are always concerned with phenomena of such a nature that they may be assumed to repeat themselves. This applies to both static and dynamic law formulations. Now, the most important economic data are given as time series, a quite specific series of successive events. Is it indeed possible to test laws for recurrent phenomena on the basis of such time-bound variations? To be able to say anything about this question we must study the characteristic path of the observed time series. For economic time series there are usually two features that catch our attention: one is a steady, even development, the trend path; the other is certain variations around the trend movement. We can often trace the trend back to certain more sluggishly moving factors (notably changes in the size and composition of the population); factors that lie outside the circuit of variables included in our hypotheses and work independently of the variations we want to study. In such cases it is natural to take the trend as a datum in the analysis, and consider what happens apart from the trend variation. This is the rational basis for a statistical elimination of the trend in our observation series.

It is unacceptable to undertake a purely mechanical trend elimination without being able to give a concrete interpretation of the emergence of the trend. It might very well be that an observed trend gets its natural explanation through the relations and the set of variables that are included in our hypotheses. We shall take a closer look at this. A realistic system of hypotheses will comprise static as well as dynamic structural relations. The formulation of the various structural relations is founded on certain imagined alternative variations, partly in the variables themselves at a given point in time, and partly in growth rates and "lag" terms. Assume that our efforts lead to a determinate dynamic system, allowing us to solve it, i.e., to find the time path of the variables studied. It may then well happen that the observed trend movements are just the possible solutions of this system. In other words, the trend movement may emerge as a confluent form of the dynamic system of structural equations. The observed trend movements can then often be taken to be a statistical verification of our system of hypotheses. If we mechanically eliminate the trend in advance as one or another time function (e.g., a straight line or an exponential function), then we have, first, prevented ourselves from realizing that our system of hypotheses may give a plausible explanation of the trend movement.
Further, we may have prevented ourselves from undertaking a statistical testing and estimation of coefficients of certain structural relations where this would otherwise have been possible, as it may happen that for some variables the trend is the only significant element, while the other variations are disguised by random "errors". And, finally, we have confined attention to a particular trend path independent of our structural relations, so that we cannot uncover the way in which changes in the structure would influence the trend. This latter point can really be of crucial interest in assessing regulatory measures.

When our testing data constitute series with pronounced trend movements, it might be asserted that the hypotheses we verify are not laws for recurrent phenomena, but only a description of a historic development. If this view were to be accepted in general, it would be a heavy blow to the striving to establish economic laws. But we do not need to restrict ourselves to such a negative position, as is already apparent from our remarks on the trend problem. Either the trend has its origins outside our system of hypotheses; if we specify these causes, we are entitled to eliminate the trend in advance, so that we consider only the residual variation as having the character of recurrent cases. Or the trend is rooted in the structure of the system under consideration, an outcome of an analysis of free variations, being explained through the same system of hypotheses as the one which leads to the variations of recurrent character. (To examine whether the latter holds true, it may be of interest to try out the same hypothesis on detrended data.) There is really no reason why trended data could not at the same time be conceived as recurrent phenomena; it is only a question of what we are considering, whether the variables themselves or their growth rates. As soon as a growth rate varies around a non-zero average, the corresponding variable will itself become trended. For example, let W be the stock of real capital at a certain time and w and u investment and depreciation per unit of time, respectively. Regardless of our hypothesis for the form of the relation between w and u, provided it includes the condition that w should on average exceed u, W will follow an increasing trend (we assume w and u positive). Then we just have

(7.1) Ẇ(t) = w(t) − u(t) = a variable that is positive on average.

Here the different situations with respect to Ẇ(t) (the growth rate of W) are the elements which can repeat themselves, W itself moving gradually towards new positions. And this is what is implicitly expressed in our hypothesis.
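A tiny sketch of (7.1) with invented series: investment and depreciation fluctuate around fixed levels with w on average above u, so that the growth rate Ẇ is a recurrent, trend-free quantity while the capital stock W itself acquires a steadily rising trend.

```python
# Invented series illustrating (7.1): recurrent growth rates, trending level.
import numpy as np

rng = np.random.default_rng(4)
T = 100
w = 12 + rng.normal(0, 2, T)       # investment per unit of time
u = 10 + rng.normal(0, 2, T)       # depreciation per unit of time
W_dot = w - u                      # growth rate of the capital stock, eq. (7.1)
W = 100 + np.cumsum(W_dot)         # the capital stock itself

print(f"mean of W_dot: {W_dot.mean():.2f}  (fluctuates around +2, no trend)")
print(f"W over the period: {W[0]:.1f} ... {W[-1]:.1f}  (a pronounced upward trend)")
```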
8. AVERAGE VERSUS MOMENTARY EXPLANATIONS

We mentioned above how the choice between different possible hypotheses cannot be made unconditionally; we must establish certain criteria for how to proceed. The nature of these criteria cannot be the same in all cases. Let us stick to hypotheses that are given as an equation between certain economic variables. By transforming the variables, constructing new terms, etc., we will in general be able to express the relation in linear form so that we can use linear regression analysis as a testing tool. We will be inclined to accept such a relationship if the statistical agreement between the observed data and those that can be computed from the regression equation is good. This question is often mixed up with the question of whether the regression-determined coefficients in the hypothesis are significant or not (i.e., whether the absolute values of the coefficients are large relative to their standard errors). These, however, are two different issues, the first being about the error in the momentary explanation, the second about the error of an average relation. (This, by the way, should not be mixed up with the fact that each observation entering the analysis can be an average for a larger group of units, as mentioned above when commenting on the statistical nature of economic laws.)

Let us take an example to illustrate this difference between average explanation and momentary explanation.¹ Consider N observations on three variables, x, y, z, which are exactly related through the equation

(8.1) y = 2x − 3z.

For simplicity, we assume that the three variables are measured from their respective means, and that they have finite standard deviations σ_x, σ_y, σ_z, which together with the correlation coefficients r_xy, r_xz, r_yz approach certain fixed numbers when the number of observations becomes large. Suppose that we do not know the exact relation (8.1), but believe that there exists a proportionality relationship between y and x only,

(8.2) y = b·x.

The correlation coefficient between these two variables becomes

(8.3) r_xy = Σ x_i·y_i / (N·σ_x·σ_y) = Σ x_i·(2x_i − 3z_i) / (N·σ_x·σ_y),

the sums running over i = 1, …, N. After rearrangement we get

(8.4) r_xy = 2·(σ_x/σ_y) − 3·r_xz·(σ_z/σ_y).

Thus, from our above assumptions, r_xy will converge to a fixed number when N increases. Now, the elementary regression of y on x is defined by

(8.5) b = r_xy·(σ_y/σ_x).

Inserting (8.4) in (8.5) we get, as our expression for (8.2),

(8.6) y_calculated = (2 − 3·r_xz·(σ_z/σ_x))·x.

Now, it is well known that the standard error of the regression coefficient b equals

(8.7) σ_b = (1/√(N − 2))·(σ_y/σ_x)·√(1 − r_xy²),

and this entity becomes larger the smaller N is. In other words, the regression between y and x becomes more precisely determined the larger the number of observations at our disposal. Thus, there exists an average relation between y and x per group of N observations, which is more stable the larger is N. But this does not necessarily mean that (8.6) gives a good agreement between observed and calculated values of y, i.e., a good description of the momentary variation in y. Consider now the mean squared difference between y in (8.1) and y_calculated in (8.6). It becomes

(8.8) (1/N)·Σ (y_i − y_i,calculated)² = (1/N)·Σ (2x_i − 3z_i − 2x_i + 3·r_xz·(σ_z/σ_x)·x_i)² = 9·(1 − r_xz²)·σ_z².

We see that irrespective of the number of observations, the same unexplained variance is left in y, expressed by 9·(1 − r_xz²)·σ_z² on the right-hand side.

¹ Translator's note: The terms 'average explanation' and 'momentary explanation' have been translated rather directly from terms Haavelmo coined in Norwegian for this paper ('gjennomsnittsforklaring', 'momentanforklaring') and probably never used again. Neither do these concepts seem to have been given much attention in the later econometric literature, although the subject matter is highly important. The suggested translations are thus offered for lack of better terms.
When the number of observations increases, the error in the average explanation is more and more reduced, while the error in the momentary explanation remains at the same level as long as we do not include in our hypothesis new variables (here z) which can explain more or less of the residual dispersion. One sees from (8.8) that the momentary explanation of y by means of (8.6) becomes better the smaller the variation in z and the larger the correlation (r_xz) between the latter variation and that of x as included in (8.6). If the correlation r_xz is high, this obviously means that we, by accounting for the variation in x, have also accounted for part of the variation in z. If now r_xz is small while z displays very little variation, this means that z is a superfluous variable when it comes to explaining the observed variation of y in this material. But that does not necessarily mean that (8.6) will give us a good forecast of y outside the period covered by the data. For from (8.1) we see that z will exert a substantial impact if it should happen to vary markedly more strongly than it does in our material. Hence, although x alone may give a very good explanation of y in our material, it may be of decisive importance whether or not we can utilize the tiny part of the variation that remains in attempting to capture the effect of z (i.e., the coefficient value 3 in (8.1)).

If now x and z are uncorrelated, then, even if z were to vary strongly, the average explanation of y by x would still be the same, that is y = 2x. This is seen from (8.6). But the momentary explanation would be much poorer, as is seen from (8.8). If there now, in addition to the stronger variations in z, is also a certain correlation between x and z, then we will get more or less distorted information about x's effect on y by sticking to relation (8.6), even if this relation, as an average explanation, has full explanatory power from a purely statistical point of view. For while x, as is seen from (8.1), in reality affects y by twice its own value, this is not the case in (8.6) as long as r_xz differs from zero. If r_xz is larger than zero, the coefficient of x may even become negative. This is just a result we can get if we impose a "ceteris paribus"-clause on z without knowing its effect, that is, without knowing the fundamental relation (8.1).

This indeed shows how crucial it is to have, in advance, a formulation of the hypotheses in which one operates with specific fictitious variations. If one refrains from doing so, one runs the risk of missing important variables that for some reason have not shown significant variation in the material at hand. And although a simpler hypothesis may give a stable average explanation and for that reason be accepted as statistically valid, it may well give a very poor, maybe completely worthless, momentary explanation and no deeper insight into the structural relationships between the studied variables.
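As a closing illustration, the distinction can be reproduced numerically; the sketch below uses invented data. Observations are generated from (8.1), y is regressed on x alone as in (8.2), and one sees the fitted coefficient settle down to the value 2 − 3·r_xz·(σ_z/σ_x) of (8.6) as N grows (a stable average explanation), while the residual variance stays near 9·(1 − r_xz²)·σ_z² of (8.8) no matter how large N becomes (the momentary explanation does not improve).

```python
# Invented illustration of section 8: the average explanation stabilizes as N
# grows, while the error of the momentary explanation does not shrink.
import numpy as np

rng = np.random.default_rng(5)
for N in (50, 500, 50000):
    x = rng.normal(0, 1.0, N)
    z = 0.4 * x + rng.normal(0, 0.6, N)      # z correlated with x
    x, z = x - x.mean(), z - z.mean()        # measured from their means
    y = 2 * x - 3 * z                        # the exact relation (8.1)

    b = (x @ y) / (x @ x)                    # regression of y on x alone, as in (8.2)
    resid_var = np.mean((y - b * x) ** 2)    # error of the momentary explanation

    print(f"N = {N:6d}   b = {b:6.3f}   residual variance = {resid_var:5.2f}")
# As N grows, b settles down to 2 - 3*r_xz*(sigma_z/sigma_x), about 0.8 for
# these population values, while the residual variance stays near
# 9*(1 - r_xz^2)*sigma_z^2, about 3.2, no matter how many observations we use.
```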