1. Introduction

Food versus fuel is America’s newest battle. At the front is the question of whether America can increase annual corn-based ethanol production from 10.75 billion gallons today to 15 billion gallons in 2015 (Renewable Fuels Association, 2010). The Renewable Fuel Standard (RFS) program was originally created by the Energy Policy Act of 2005 and expanded under the Energy Independence and Security Act (EISA) of 2007. It underwent statutory revisions in February 2010 mandating that 15 billion gallons of corn-based ethanol be blended into gasoline annually by 2015 (U.S. Environmental Protection Agency, 2010). This mandate is the latest step in a long history of U.S. state and federal ethanol policies. Although ethanol is used as a transportation fuel today, it was originally used as an illuminating oil prior to the Civil War. A federal tax of $2 per gallon imposed in 1862 to fund war efforts made ethanol’s use cost prohibitive. Even though the tax was removed in 1906, it wasn’t until the Energy Tax Act of 1978 that America’s modern ethanol industry was born (U.S. Department of Energy, 2008).

Today, domestic ethanol production is encouraged through a combination of state and federal policies. The primary federal ethanol policies are: the RFS ethanol use mandate; the Winter Oxygenated Fuels Program; the Small Ethanol Producer Tax Credit of $.10 per gallon for plants producing 60 million gallons of ethanol per year or less; the $.45 per gallon ethanol blender’s credit (RFA, 2010); and a $.54 per gallon tariff on imported ethanol, with the exception of the duty-free importation of 240 million gallons of ethanol annually under the Caribbean Basin Initiative (Zhang, 2007). A time line of major policy events in U.S. ethanol production is presented in the graph titled “U.S. Ethanol Supply and Demand” on page 1. The graph links policy changes to levels of U.S. ethanol production, consumption and imports.
In addition to federal ethanol policies, there are 106 different state laws affecting ethanol production and marketing. These state policies fall into one of five broad categories: producer incentive programs (i.e. preferential tax treatment) and grant funds, retailer and infrastructure incentives for ethanol blends and E-85, state use mandates, retail pump label requirements, and state fleet fuel purchase and use requirements. Simulations run by the Food and Ag Policy Research Institute (FAPRI, 2008) suggest state policies have a minor impact on ethanol demand today, shifting the U.S. ethanol demand curve outward by no more than 10%. However, it was state level bans on methyl tertiary butyl ether (MTBE)[1] between 2002 and 2007 (Low and Isserman, 2009) that enabled the U.S. ethanol industry to double its annual capacity from 2.3 billion gallons in 2002 (61 plants) to 5.5 billion gallons (110 plants) in 2007, according to the Renewable Fuels Association (2010). This growth slowed in 2008 as a result of a recession and poor ethanol plant profitability caused by collapsing commodity markets. These conditions sparked a round of bankruptcies and plant shutdowns starting in 2008 that left 36 plants (Johnson, 2009), or 21% of total U.S. ethanol capacity, idle by spring 2009 (RFA, 2010). Although the majority of idle capacity is presently back in production and 11 new ethanol plants are currently being built, the widespread effects of ethanol warrant an examination of U.S. ethanol policy and production. The objective of this paper is to answer the question: what impact do government policies have on U.S. ethanol production? The answer to this question is of interest to policy makers and industry organizations alike.

[1] MTBE is a chemical compound produced by the reaction of methanol and isobutylene. It is used as a fuel additive in gasoline. As part of a family of compounds called oxygenates, MTBE raises the oxygen content of gasoline (U.S. Environmental Protection Agency, 2008).
The puzzle in the literature this paper specifically tries to address is: what is the effect of states banning the fuel additive MTBE if the Reformulated Gasoline (RFG)[2] program is in place? To fill this hole in the literature, 2SLS is used where the quantity of ethanol and the price of ethanol are treated as endogenous. It is found that increasing the percentage of the total population living in states with MTBE bans by 10% will always lead to a .255% increase in the quantity of ethanol demanded regardless of whether or not the RFG program is in place. This estimate should be interpreted with caution. Although the system of equations using annual national level data did not suffer from autocorrelation and had exogenous instruments to resolve the endogeneity of the ethanol price in the supply and demand equations, the instruments used to identify the demand curve were not relevant, and therefore MTBE was the only government policy variable in the demand equation to be statistically significant. Only one government policy variable, the federal blender’s tax credit, was included in the supply equation, and it was statistically significant. A 1% increase in the federal blender’s credit results in a 2.4% increase in the quantity of ethanol produced.

2. Literature Review

Controversy surrounding ethanol and the broader renewable fuels movement has enabled economists, political scientists, environmental scientists and engineers to secure extensive grants to study the ethanol industry. This grant funding has incentivized many scholars to focus their research on specific ethanol policy or program effects. As a result, little work has been done to estimate U.S. ethanol supply and demand curves. The work that has been done to estimate U.S. ethanol supply and demand curves using 2SLS was pioneered by Kevin Rask. Rask (1998) used state level monthly data from 1984 to 1993 to develop an ethanol supply and demand model and to calculate elasticities.
In a similar spirit, Luchansky and Monks used monthly national level data from 1997 to 2006 to estimate U.S. ethanol supply and demand curves. Luchansky and Monks also calculated elasticities to measure the response of ethanol production to ethanol, gasoline, MTBE and corn price changes. Problems using endogenous variables were resolved using 2SLS. It was found that when the corn price was treated as endogenous it had a positive coefficient, implying that an increase in corn prices leads to an increase in the equilibrium quantity of ethanol. In contrast, when a corn price instrumental variable was used instead of the corn price itself, the coefficient on the corn price instrumental variable had the correct sign (Luchansky and Monks, 2009).

This paper expands the work of Luchansky, Monks and Rask by using annual national level data from 1982 to 2008. In addition, Luchansky and Monks’ supply curve specification does not account for the value of ethanol byproducts used for livestock feed. Luchansky and Monks include corn oil as a co-product price, but most ethanol plants are dry-grind plants and don’t have the technology to capture corn oil from the corn kernel. In response, this paper includes the soybean meal price in the supply equation as a proxy for the value of ethanol plant byproducts used as livestock feed.

In addition to variable selection, it is also important to consider this paper’s econometric methods. 2SLS is commonly used in commodity markets to estimate supply and demand curves, although it is difficult to correctly specify the structural equations for some markets. For example, C.Y. Cynthia Lin, in her paper Estimating Annual and Monthly Supply and Demand for World Oil: A Dry Hole?, used 2SLS to estimate aggregate supply and demand curves for world oil (2004).

[2] The RFG program lasted from 1995 to 2006. It was a fuel oxygenate requirement the EPA imposed upon cities with poor air quality.
Lin’s use of instrumental variables did not yield coefficients of the expected sign for OPEC demand and for many of the parameters in the supply equations. This indicates that Lin’s econometric specifications or economic theory didn’t accurately reflect the complexities of the world oil market (2004). These results suggest that correctly specifying aggregate supply and demand models is difficult and that initial attempts to estimate aggregate supply and demand models with 2SLS commonly yield coefficients whose signs contradict economic theory. Unlike previous works, this paper uses annual data to avoid the autocorrelation associated with monthly data. In addition, this paper uses national data, but creates an instrument to account for state level changes in MTBE bans over time.

3. Economic Theory

U.S. ethanol price and quantity are determined by the intersection of the supply and demand curves. The primary determinant of the ethanol supply curve is the corn price. According to the USDA’s January Feed Outlook Report (2010), 32.1% of the U.S. corn crop for the 2009/2010 marketing year is used for ethanol, while the remainder is used for non-ethanol food, seed and industrial use (9.7%), livestock feed and residual use (42.5%), and exports (15.7%). Thus, any change in ethanol policies or production influences corn prices. This means that the corn price cannot be used directly in the supply equation. The corn price must either be proxied for using a lagged corn price or be instrumented for using 2SLS. Ethanol production’s impact on input markets is likely heightened by government policies that aim to directly or indirectly increase ethanol demand. Since MTBE was banned in over twenty states from 2000 to 2007 while the RFG program, which mandated the use of either MTBE or ethanol to make gasoline burn cleaner, was in place from 1995 to 2005, economic theory suggests that an MTBE ban will increase ethanol demand by a larger amount when the RFG program is in place than when it is not.
4. Data

Ideally, state level panel data would be available for the price and quantity of ethanol produced and consumed in the U.S. since 1982. Unfortunately, the U.S. Department of Energy stopped maintaining its state level ethanol series in May 1993. Currently, the U.S. Department of Energy has annual state level data and monthly national level data on the quantity of ethanol consumed and produced. Ideally, state level historical ethanol prices would also be available. In order to estimate a system of equations using state level panel data, the quantity of ethanol shipped in and out of each state has to be known. The U.S. Department of Energy (2009) currently tracks ethanol shipments, but only tracks the quantity of ethanol shipped and imported by region, not by state, and the data series was only started in 2009. It would also be ideal to have a database of all federal and state level ethanol subsidies so that a variable of total subsidy per gallon of ethanol could be computed.

Given these data constraints, this analysis uses U.S. national annual data from 1982 to 2008. National level data for the quantity of ethanol produced, the quantity of ethanol consumed, the level of the federal blender’s ethanol credit, well head natural gas prices and city average retail prices for all types of gasoline including taxes came from the United States Department of Energy’s Energy Information Administration (2009). Ethanol price data were received from Nebraska’s Energy Office (2009). Average annual corn and soybean meal prices were computed using the Chicago Board of Trade’s daily closing prices for the nearby contract. These data were received from Montana State University’s Department of Agricultural Economics and Economics (2009). All of the price data above were converted to real terms using annual GDP implicit price deflator values with 2008 as the base year. The GDP implicit price deflator and the U.S. average per capita income were retrieved from the U.S. Bureau of Economic Analysis (2009).
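The conversion of nominal prices to real 2008 dollars described above can be sketched as follows. This is a minimal illustration, not the author’s actual procedure: the function name `to_real_price` and the deflator and price values are hypothetical and are not taken from the paper’s data set.

```python
def to_real_price(nominal_price, deflator, base_deflator):
    """Convert a nominal price to base-year dollars.

    real = nominal * (deflator in base year / deflator in observation year)
    """
    return nominal_price * base_deflator / deflator


# Hypothetical deflator index values, chosen only to make the arithmetic easy.
deflator_by_year = {1982: 50.0, 2008: 100.0}

# A hypothetical nominal price of $1.50 observed in 1982.
nominal_price_1982 = 1.50
real_price_1982 = to_real_price(
    nominal_price_1982,
    deflator_by_year[1982],
    deflator_by_year[2008],  # 2008 is the base year, as in the text
)
print(real_price_1982)
```

A price observed in the base year is left unchanged by this conversion, since the two deflator values cancel.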
State and national level population data were received from the U.S. Census Bureau’s 2010 Statistical Abstract (2010). Data on the number of licensed drivers in the U.S. were received from the U.S. Department of Transportation (2009). In addition, the data on which states implemented MTBE bans into law were provided by the United States Environmental Protection Agency (2007). All additional information concerning state and federal ethanol programs was obtained from the Renewable Fuels Association (2010).

Table 1. Summary Statistics for U.S. Ethanol Market, 1982 to 2008

Variable                                                      Mean   Std. Dev.        Min        Max
Endogenous Variables
  Ethanol production (gallons / licensed U.S. driver)         8.55        8.23       1.25      37.84
  Ethanol consumption (gallons / licensed U.S. driver)        8.78        8.74       1.25      39.36
  Real ethanol price ($ / gallon)                             1.98        0.56       1.22       3.35
  Real corn price ($ / bushel)                                3.64        1.01       2.29       5.98
Exogenous Variables Exclusive to Supply Equation
  Real corn price lagged one year ($ / bushel)                3.65        1.03       2.29       5.85
  Real natural gas price ($ / thousand cubic feet)            4.07        1.00       2.06       8.07
  Real soybean meal price ($ / ton)                         263.20        1.78     172.90     381.60
  Real federal blender's credit ($ / gallon)                  0.75        0.17       0.51       1.09
Exogenous Variables Exclusive to Demand Equation
  Real unleaded gasoline price ($ / gallon)                   1.96        0.48       1.41       3.32
  MTBE instrumental variable                                  0.10        0.19       0.00       0.51
  Winter Oxygenate Program 0/1 dummy                          0.63        0.49       0.00       1.00
  Reformulated Gasoline Program 0/1 dummy                     0.41        0.50       0.00       1.00
  U.S. average per capita income                             32.81     5327.73   23327.00   40222.00
N = 27 annual observations

Table 1 summarizes all key variables used in the study. There is a large difference between the maximum and the minimum quantity of ethanol produced per licensed U.S. driver. The quantity of ethanol produced per licensed U.S. driver was 1.25 gallons in 1982 and steadily increased to 37.84 gallons in 2008.
It is also worth noting that the mean for the instrumental variable MTBE is only .1 because the first state level MTBE ban did not occur until 2000.

5. Theoretical Framework

Testing the effects that state and federal ethanol policies have on U.S. ethanol production would ideally be done by implementing each policy one at a time for an extended period, to quantify the short and long-run impacts of each policy when it is the sole ethanol policy in use. Then a series of long-run experiments would be conducted by implementing different ethanol policies, such as a blender’s credit and a tariff on imports, simultaneously to quantify the interaction effects of different government ethanol policies. Over the period in which these experiments are conducted, all other variables, such as population and per capita income, would have to be held constant so that the study’s results aren’t confounded by changes in variables besides the ethanol policies themselves. Unfortunately, Figure 1 shows that empirically estimating the impact of government policies on U.S. ethanol production is not easy, because there is no clear systematic relationship between the quantity of ethanol demanded and the price of ethanol. This occurs because of the large number of governmental policies affecting the U.S. ethanol industry simultaneously, because of structural changes in demand, such as consumers’ preferences for vehicles able to burn higher amounts of ethanol, and because of structural changes in ethanol supply, such as improvements in ethanol processing technology.

[Figure 1. Relationship Between U.S. Ethanol Price and Quantity Demanded, 1982 - 2008. Horizontal axis: quantity demanded (gallons / licensed U.S. driver); vertical axis: price ($ / gallon).]

Further evidence of structural changes in the U.S. ethanol industry is suggested by the pairwise plots in Figure 3. The pairwise plots allow the reader to compare 5 different variables. The 4 obvious variables, the quantity of ethanol demanded per licensed U.S. driver, the ethanol price, the gasoline price and the corn price, are labeled on the graph. The fifth variable in the pairwise plots is time. The colors code four different time periods as follows:

Black = 1982 to 1990
Red = 1991 to 2000
Green = 2001 to 2004
Blue = 2005 to 2008

These twelve plots show there are non-linear time trends that need to be entered in either the supply equation, the demand equation or both. A classic example of this non-linearity is the relationship between the quantity of ethanol demanded and the price of ethanol, as seen in the plot located in the second row of the first column. Another plot suggesting there have been structural adjustments in both input and output markets for ethanol over time is the plot in the fourth row of the first column. This plot shows the relationship between the quantity of ethanol demanded and the price of corn. It shows that in the early years of ethanol production, when ethanol consumed a small percentage of total corn production, an increase in the quantity of ethanol demanded was associated with a decrease in the corn price, but in recent years, as ethanol production has grown to use 1/3 of the total U.S. corn crop, an increase in the quantity of ethanol demanded has bid up the price of its largest input, corn. It is also important to note that in a system of equations where the quantities supplied and demanded are simultaneously determined, the quantity of ethanol demanded equals the quantity of ethanol supplied. This is why only the quantity of ethanol demanded was included in the pairwise plots.

[Figure 3. Pairwise Plots for U.S. Ethanol Market, 1982 - 2008. Panels plot qd, pe, pg and pc against one another.]

U.S.
ethanol supply and demand equations can be specified with the following functional forms:

SUPPLY:

$$ Q_t^s = \alpha \, Pe_t^{B_1} \, Pc_{t-1}^{B_2} \, Png_t^{B_3} \, Psm_t^{B_4} \, BC_t^{B_5} \, e^{B_6 TECH_t + B_7 TECHSQ_t} \, e^{\varepsilon_t} \qquad (1) $$

Where: $\alpha = e^{B_0}$

Taking the logs of both sides yields the log-log model:

$$ \ln(Q_t^s) = B_0 + B_1 \ln(Pe_t) + B_2 \ln(Pc_{t-1}) + B_3 \ln(Png_t) + B_4 \ln(Psm_t) + B_5 \ln(BC_t) + B_6 TECH_t + B_7 TECHSQ_t + \varepsilon_t \qquad (2) $$

DEMAND:

$$ Q_t^d = \alpha \, Pe_t^{B_1} \, Pg_t^{B_2} \, PCI_t^{B_3} \, e^{B_4 RFG_t + B_5 WD_t + B_6 MTBE_t + B_7 (MTBE_t \times RFG_t)} \, e^{\varepsilon_t} \qquad (3) $$

Taking the logs of both sides yields the log-log model:

$$ \ln(Q_t^d) = B_0 + B_1 \ln(Pe_t) + B_2 \ln(Pg_t) + B_3 \ln(PCI_t) + B_4 RFG_t + B_5 WD_t + B_6 MTBE_t + B_7 (MTBE_t \times RFG_t) + \varepsilon_t \qquad (4) $$

Where:
Pe = annual ethanol rack price F.O.B. Omaha, NE ($ / gallon)
Pct-1 = average annual Chicago Board of Trade corn price lagged one year ($ / bushel)
Qs = annual quantity of U.S. ethanol produced (gallons / licensed U.S. driver)
Qd = annual quantity of U.S. ethanol demanded (gallons / licensed U.S. driver)
Pc = average annual Chicago Board of Trade corn price ($ / bushel)
Png = annual well head natural gas price, U.S. Department of Energy ($ / thousand cubic feet)
Pg = U.S. city average retail price including taxes for all types of gasoline
Psm = average annual Chicago Board of Trade soybean meal price ($ / ton)
BC = federal blender’s credit ($ / gallon)
TECH = linear time trend technology proxy (i.e. 1 for 1982 through 27 for 2008)
TECHSQ = quadratic time trend technology proxy
MTBE = MTBE instrumental variable (see below)
WD = 0/1 dummy variable for Winter Fuel Oxygenate Program (1 = program in place)
RFG = 0/1 dummy variable for Reformulated Gasoline Program (1 = program in place)
PCI = U.S. average per capita income

Note: All prices are inflation adjusted using the GDP implicit price deflator with 2008 as the base year.
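To illustrate why the log transformation of a multiplicative model is convenient, the sketch below simulates a one-regressor version of such a model and recovers the exponent by regressing log quantity on log price. Everything here is hypothetical: the data are simulated, the parameter values (`true_elasticity`, `alpha`) are arbitrary, and the price is treated as exogenous, which is not the case in the paper’s supply and demand system.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500

# Hypothetical parameters of a multiplicative model Q = alpha * P**b * exp(noise).
true_elasticity = 1.3
alpha = 2.0

price = rng.uniform(1.0, 4.0, size=n)
noise = rng.normal(scale=0.1, size=n)
quantity = alpha * price ** true_elasticity * np.exp(noise)

# Taking logs gives a linear model: ln Q = ln(alpha) + b * ln P + noise.
X = np.column_stack([np.ones(n), np.log(price)])
intercept, slope = np.linalg.lstsq(X, np.log(quantity), rcond=None)[0]

# The slope on ln P estimates the elasticity b; the intercept estimates ln(alpha).
print(f"estimated elasticity: {slope:.3f}")
print(f"estimated ln(alpha):  {intercept:.3f}")
```

In this simplified setting the fitted slope is an estimate of the percentage change in quantity associated with a one percent change in price.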
To run a regression on the supply equation (1) and the demand equation (3), the natural log must be taken on both sides of each equation to transform it into a linear form on which a regression can be run. The double log model is useful because the regression coefficients provide direct estimates of the elasticities. U.S. aggregate supply and demand curves are estimated using 2SLS to remove the endogeneity of the ethanol price in both the supply and the demand equation. Identification of the supply and demand equations requires the use of “shifters,” that is, variables found only in the demand equation that enable the supply equation to be identified, and vice-versa. The fitted value of the ethanol price is calculated in the first stage of 2SLS by regressing the ethanol price on all the exogenous variables from both the supply and the demand equation. The fitted ethanol prices for the supply and demand equations are estimated by equations (6) and (8) respectively. After the endogenous variables are estimated in the first stage of 2SLS, the estimated endogenous variables are plugged into the second stage of their respective supply (7) or demand (9) equations.

The exogenous variables in the supply and demand equations are: Png, Pct-1, Pg, Psm, PCI, BC, RFG, WD, MTBE, TECH and TECHSQ. Png, Pct-1, Psm, BC, TECH and TECHSQ are all found in the supply curve and are referred to as “supply shifters.” In contrast, Pg, PCI, RFG, WD and MTBE are all found in the demand equation and are referred to as “demand shifters.” The quantity and price of ethanol are endogenous. Normally a Hausman Test would be used to verify whether or not an explanatory variable is endogenous. However, in the case of supply and demand equations there is strong reverse causation between quantity and own price; therefore the price of ethanol can be treated as endogenous without conducting a Hausman Test. Initially, the corn price is treated as endogenous to the model, as shown in Regression (4) in Table 2.
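The two-stage procedure described above can be sketched on simulated data. This is a minimal illustration under stated assumptions, not the paper’s estimation: the structural coefficients, the single demand shifter `z_d` and the single supply shifter `z_s` are hypothetical, and the helper names `ols` and `two_stage_least_squares` are introduced only for this example.

```python
import numpy as np


def ols(y, X):
    """Least-squares coefficients for y = X b + e."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def two_stage_least_squares(y, endog, exog, instruments):
    """2SLS with one endogenous regressor.

    y           : (n,) dependent variable (e.g. log quantity)
    endog       : (n,) endogenous regressor (e.g. log price)
    exog        : (n, k) exogenous regressors included in the equation
    instruments : (n, m) shifters excluded from the equation
    Returns coefficients ordered as [constant, endog, exog...].
    """
    n = len(y)
    const = np.ones((n, 1))
    # First stage: regress the endogenous variable on ALL exogenous variables.
    Z = np.hstack([const, exog, instruments])
    fitted = Z @ ols(endog, Z)
    # Second stage: replace the endogenous variable with its fitted value.
    X2 = np.hstack([const, fitted.reshape(-1, 1), exog])
    return ols(y, X2)


rng = np.random.default_rng(0)
n = 5000
z_d = rng.normal(size=n)  # hypothetical demand shifter
z_s = rng.normal(size=n)  # hypothetical supply shifter
u_d = rng.normal(scale=0.5, size=n)
u_s = rng.normal(scale=0.5, size=n)

# Hypothetical structural system in logs:
#   demand: q = 1.0 - 1.5 p + 1.0 z_d + u_d
#   supply: q = 0.5 + 1.0 p + 1.0 z_s + u_s
# Setting demand equal to supply and solving for the equilibrium price:
p = (0.5 + z_d - z_s + u_d - u_s) / 2.5
q = 0.5 + 1.0 * p + z_s + u_s

# The demand equation is identified by the supply shifter, which is excluded from it.
demand_2sls = two_stage_least_squares(q, p, z_d.reshape(-1, 1), z_s.reshape(-1, 1))
demand_ols = ols(q, np.column_stack([np.ones(n), p, z_d]))

print("2SLS price coefficient:", demand_2sls[1])
print("OLS price coefficient: ", demand_ols[1])
```

In this simulation the price is positively correlated with the demand disturbance, so the OLS price coefficient is pulled away from the true value of -1.5, while the 2SLS estimate stays close to it. Note that standard errors from a manually run second stage are not valid as computed; dedicated IV routines correct for this.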
Granted, the magnitudes and signs of the coefficients are similar in regression (2), where the lagged corn price is used and treated as exogenous, and in regression (4), where the actual corn price, which is endogenous, is used. However, the ethanol price in the demand equation is only statistically significant when the lagged, not the actual, corn price is used. Therefore the actual corn price is not used, and the corn price lagged by one year is used instead. Using PCt-1 to proxy for the U.S. corn price may appear to introduce yet another endogenous variable. Granted, PCt-1 is correlated with the endogenous U.S. corn price, because major adjustments to global corn supplies rarely occur within a single year and are instead part of long-term trends. Figure 2 shows that the actual corn price and the corn price lagged one year have a positive linear relationship; therefore the lagged corn price is a good proxy for the actual corn price, assuming the lagged corn price is uncorrelated with the error term.

[Figure 2. Relationship Between Corn Price and Corn Price Lagged One Year, 1982 - 2008. Horizontal axis: corn price ($ / bushel); vertical axis: lagged corn price ($ / bushel).]

At first, PCt-1 may appear to be just another “corn price” and hence correlated with the error term in the second stage of 2SLS. It is important to note, however, that PCt-1 is a predetermined variable in the 2SLS model. According to William H. Greene in his book Econometric Analysis, PCt-1 is a predetermined variable because PCt-1 is independent of all subsequent structural disturbances. Since PCt-1 can be treated as if it were exogenous, consistent estimates can be achieved when PCt-1 is used as an independent variable to estimate the corn price instrumental variable (Greene, 2003). A corn price determined last year is exogenous to this year’s error term. Consequently, the lagged price of corn satisfies the requirements of an instrumental variable.
It should be noted that lagging the corn price reduces, but does not completely eliminate, the problem of the corn price’s endogeneity. However, tests discussed in the results section conclude that the lagged corn price is a valid instrument. Granted, it can be argued that PCt-1 is only chronologically predetermined and that PCt-1 is not actually exogenous, since there are federal crop and ethanol programs that affect both the current and the lagged corn price, but PCt-1 is predetermined with respect to short-run shocks. For example, news this year regarding a Chinese crop failure cannot affect last year’s corn price.

Another variable in the system of equations deserving mention is the soybean meal price found in the ethanol supply equation. While not a direct cost, Psm, the Chicago Board of Trade’s soybean meal price in dollars per ton, is included in the supply equation. Soybean meal prices serve as a proxy for the value of high protein feedstuffs used in livestock rations. Byproducts produced by ethanol plants, called distillers grains, are primarily valued for their protein content. Until recently, ethanol byproduct markets were highly local, with prices not reported by the USDA. As a result, soybean meal is used as a proxy for distillers grain values even though this proxy suffers from measurement error. Soybean meal is 12% moisture and can be stored for long time periods. In contrast, wet distillers grains are 70% moisture and can’t be stored for more than 2 weeks. This means ethanol plants have to sell wet distillers grains for less than their protein value indicates during periods of low demand, such as the summer months, or when an ethanol plant’s dryers shut down and all distillers grains must be sold wet. Clearly there will be measurement error associated with the use of soybean meal as a proxy for distillers grain values.
Granted, the measurement error increases the error variance, but the measurement error does not affect any of the ordinary least squares properties. Despite the presence of measurement error, the U.S. ethanol supply equation still satisfies the first four Gauss-Markov Assumptions, meaning the U.S. ethanol supply equation still has unbiased and consistent estimators. The measurement error (e_t) for year t created by the use of soybean meal prices as a proxy for distillers grain prices is:

$$ e_t = P_{dg,t} - P_{sm,t} \qquad (5) $$

where Pdg,t is the distillers grain price and Psm,t is the soybean meal price for year t. It’s important to note that the classical errors-in-variables assumption is not invoked, since the measurement error is correlated with the unobservable variable, the distillers grain price, but not with the observed variable, the soybean meal price. The distillers grain price is correlated with the measurement error since it fluctuates frequently based on local supply and demand conditions.

The soybean meal price is included in the U.S. ethanol supply equation instead of the U.S. ethanol demand equation because the price an ethanol plant receives for its byproducts has no bearing on the quantity of ethanol demanded, but does affect the quantity of ethanol supplied. For every 1 bushel of corn entering an ethanol plant for processing, 1/3 of a bushel of corn leaves the ethanol plant as byproduct. Fluctuations in byproduct values significantly affect an ethanol plant’s net corn cost and are thus included in the U.S. ethanol supply equation[3]. Also included in the supply equation is the natural gas price. Png is the Department of Energy’s (2009) monthly well head natural gas price. Natural gas prices in dollars per thousand cubic feet are included in the supply equation since natural gas is the second largest input cost ethanol plants incur, with corn being the largest.
The federal blender’s credit is included in the supply and not the demand equation because it subsidizes the production of ethanol fuels. The federal blender’s credit pays fuel blenders $.45 per gallon to blend ethanol with gasoline. This lowers the cost paid by consumers for ethanol blended fuels, which is why it is included in the supply, and not the demand, equation.

As mentioned earlier, since this system of equations uses prices and quantities simultaneously determined by both supply and demand, we must set Qd = Qs. In reality Qd > Qs, since the U.S. rarely exports any ethanol but consistently imports a small quantity. The quantity of ethanol imported into the U.S. is so small that it was not even reported by the U.S. Department of Energy until 1993. Prior to 2003, imports never accounted for more than 1% of total U.S. ethanol consumption, and since 2003 imports peaked at 6% of total U.S. ethanol consumption in 2007 (U.S. Dept. of Energy, 2009). For purposes of this study we only use Qd, set Qd = Qs and assume no imports. Note that Qd is the annual quantity in gallons of U.S. ethanol demanded per licensed U.S. driver. Using the quantity of ethanol demanded per licensed U.S. driver as the dependent variable controls for changes in the driving age population over time. From 1982 to 2008 the number of licensed drivers in the United States increased from 150 to 205 million, or 36.6% (U.S. Department of Transportation, 2009). By controlling for population changes, Qd can be interpreted as the quantity of ethanol demanded by the average driver. Pe is the monthly ethanol rack price in dollars per gallon F.O.B. Omaha, Nebraska. Pg is the U.S. city average retail price including taxes for all types of gasoline.

[3] Net corn cost describes the cost of the starch used to produce ethanol. Net corn cost equals the whole kernel corn cost minus the byproduct value.
The literature fails to conclude whether the substitution or the compliment effect dominates in the relationship between gasoline and ethanol. A correlation of .67 between the price of ethanol and the price of gasoline indicates the two are strongly correlated and that gasoline is a relevant instrument for ethanol price and should be included in the supply curve. RFG is a 0/1 dummy which is 1 for years 1995 to 2006 when the Reformulated Gasoline Program was in effect and 0 otherwise. To be able to see the marginal effect that banning MTBE has on the quantity of ethanol produced when an RFG program is in place, an interaction term between MTBE and RFG is included. In regression (2), when the interaction term was not included RFG kept coming in as negative number and not statistically significant. 5.1 U.S. Ethanol Supply Equation 1st Stage of 2SLS ln(πππ‘ ) = ππ + π1 ln(πππ‘ ) + π2 ln(ππΆπΌπ‘ ) + π3 π πΉπΊπ‘ + π4 ππ·π‘ + π5 πππ΅πΈπ‘ + π6 ln(ππππ‘ ) + π7 ln(ππ πππ‘ ) + π8 ln(πππ‘−1 ) + π9 ln(π΅πΆπ‘ ) + π10 ππΈπΆπ»π‘ + π11 ππΈπΆπ»πππ‘ + π12 (πππ΅πΈπ‘ ∗ π πΉπΊπ‘ ) + µπ‘ (6) The use of 2SLS enables the U.S. ethanol supply curve (2) to be re-specified as the second stage in 2SLS: 2nd Stage of 2SLS Μπ‘ ) + π΅2 ln(πππ‘−1 ) + π΅3 ln(ππππ‘ ) + π΅4 ln(ππ ππ‘ ) + B7 ln(BCt ) + π΅6 (ππΈπΆπ»π‘ ) + ln( ππ‘π ) = π΅π + π΅1 ln(ππ π΅7 (ππΈπΆπ»πππ‘ ) + πt (7) 15 5.2 U.S. 
Ethanol Demand Equation 1st Stage of 2SLS ln(πππ‘ ) = ππ + π1 ln(πππ‘ ) + π2 ln(ππΆπΌπ‘ ) + π3 π πΉπΊπ‘ + π4 ππ·π‘ + π5 πππ΅πΈπ‘ + π6 ln(ππππ‘ ) + π7 ln(ππ πππ‘ ) + π8 ln(πππ‘−1 ) + π9 ln(π΅πΆπ‘ ) + π10 ππΈπΆπ»π‘ + π11 ππΈπΆπ»πππ‘ + π12 (πππ΅πΈπ‘ ∗ π πΉπΊπ‘ ) + µπ‘ (8) 2nd Stage of 2SLS Μπ‘ ) + π΅2 ln(πππ‘ ) + π΅3 ln(ππΆπΌπ‘ ) + π΅4 (π πΉπΊπ‘ ) + π΅5 (WDt ) + π΅6 (πππ΅πΈπ‘ ) + ln(ππ‘π ) = π΅π + π΅1 ln(ππ π΅7 (πππ΅πΈπ‘ ∗ π πΉπΊπ‘ ) + ππ‘ (9) 5.3 Constructing MTBE Instrumental Variable Ideally the MTBE ban instrumental variable would be constructed as: πππ΅πΈπ‘ = ∑π π=1 π π‘ππ‘πππ‘ (πππ‘ππ π.π.πΏπππππ ππ π·πππ£πππ )π‘ (10) Where MTBEt = Percentage of U.S. population living in states with MTBE bans in period t. statei = population of a state with an MTBE ban in period t. It is not possible to construct the MTBE ban instrumental variable this way since “total licensed U.S. drivers” is already in the left hand side variable. That is quantity supplied and demanded is expressed as quantity per licensed U.S. driver. Using “total U.S. licensed drivers” in both a dependent and explanatory variable will cause the system of equations to potentially suffer from spurious correlation between the right and left hand side variables. Eliminating this correlation can be accomplished by using an instrumental variable for MTBEt. Constructing an instrumental variable can be accomplished by calculating average state populations for the last 16 27 (only have ethanol data from 1982 to 2008). The instrumental variable becomes the sum of the 27 year average population for states with an MTBE ban divided by the 27 year U.S. average population. The instrumental variable can be expressed as follows: πππ΅πΈπ‘ = ∑π π=1 π π‘ππ‘πππ‘ π.π.πππ (11) MTBEt = equals proportion of U.S. population, based on 27 year population averages living in states with MTBE bans. n= index number for states with MTBE bans stateit = 27 year average (1982 – 2008) population for a state with an MTBE ban in time t U.S. 
Pop = 27 year average (1982 – 2008) U.S. population

6. Empirical Results

Regression (1) in Table 2 is the system of equations discussed above, estimated by OLS rather than 2SLS. Although the magnitudes and signs in the OLS equation are very similar to those in the 2SLS equation, it is known from the construction of supply and demand equations that the ethanol price is endogenous. OLS is therefore inappropriate, since the presence of an endogenous variable biases the coefficient estimates. When moving to 2SLS to resolve the endogeneity problem, ln(ethanol price) and ln(U.S. per capita income) become statistically significant. Regression (3) is the same system of equations as regression (2), except that regression (2) includes a linear and a quadratic time trend to proxy for a technology trend in the supply equation and regression (3) does not. Regression (3) has the wrong sign on most coefficients, including the price of ethanol. This suggests that for the demand curve to be identified, time trends must be included in the supply equation. This makes sense, since there has been considerable technological progress in the ethanol industry in the last 27 years.

For the reasons previously discussed, regression (2) does a better job than any other regression the author experimented with, both in the number of statistically significant variables with signs consistent with economic theory and in having instrumental variables that theory suggests are valid. Therefore a series of tests was conducted on regression (2) to see whether it is a correctly specified system of equations.

Table 2. Estimates of Factors Affecting U.S. Ethanol Supply, 1982 – 2008

| Explanatory Variables | Reg (1): OLS | Reg (2): 2SLS with lagged corn and interaction term | Reg (3): 2SLS with lagged corn and interaction term, no time trends | Reg (4): 2SLS with corn and interaction |
|---|---|---|---|---|
| Supply Eq.: Dep. Var. = ln(Qs) | | | | |
| Intercept | -1.17 (1.44) | -1.52 (1.5) | 0.68 (3.2) | -1.71 (1.63) |
| ln(ethanol price) | .88 (0.34)* | 1.30 (0.41)*** | -.7 (0.73) | 1.37 (0.45)*** |
| ln(corn price lagged one year) | -0.33 (0.265) | -0.44 (0.28) | -0.012 (0.65) | |
| ln(corn price) | | | | -.52 (0.39) |
| ln(soybean meal price) | .48 (0.273)* | 0.51 (0.28)* | .01 (0.656) | .57 (0.34) |
| ln(natural gas price) | (0.80) (0.2574)** | -0.93 (0.28)*** | 0.73 (0.41)* | -0.98 (0.3)*** |
| ln(federal blenders credit) | 2.40 (0.5)*** | 2.56 (0.52)*** | (2.04) (0.61)*** | 2.58 (0.53)*** |
| linear time trend | 0.06 (0.05) | 0.09 (0.05) | | 0.09 (0.05) |
| quadratic time trend | .005 (0.002)** | 0.00 (0.002)** | | .004 (0.002)** |
| R2 | 0.96 | .96 | .75 | .96 |
| Root Mean Squared Error | 0.18 | .18 | .43 | .19 |
| Demand Eq.: Dep. Var. = ln(Qd) | | | | |
| Intercept | 20.51 (8.64)** | -12.98 (11.28) | -38.3 (19.91)* | -13.33 (11.23) |
| ln(ethanol price) | -.76 (0.53) | -1.55 (0.89)* | 1.12 (1.86) | -1.52 (0.89) |
| ln(gasoline price) | 0.11 (0.63) | .90 (0.96) | -1.75 (1.91) | .87 (0.96) |
| ln(U.S. per capita income) | 2.18 (0.83)** | 1.46 (1.09) | 3.89 (1.91)* | 1.49 (1.08) |
| reformulated gasoline dummy | -0.23 (0.16) | -0.24 (0.17) | -0.22 (0.21) | -0.24 (0.167) |
| winter oxygenate dummy | 0.06 (0.14) | 0.01 (0.15) | 0.165 (0.21) | 0.02 (0.15) |
| mtbe ban dummy | 2.48 (0.77)*** | 2.55 (0.81)*** | 2.31 (1)** | 2.55 (0.81)*** |
| mtbe_rfg | -.55 (0.53) | -.63 (0.57) | -.35 (0.72) | -.63 (0.57) |
| R2 | 0.96 | 0.96 | 0.93 | 0.96 |
| Root Mean Squared Error | 0.18 | 0.19 | 0.23 | 0.19 |
| Number of Annual Observations | 27 | 27 | 27 | 27 |

Notes: Estimates are significant at the 10 percent (*p<.1), 5 percent (**p<.05), or 1 percent (***p<.01) level.

Table 3 summarizes four tests conducted on regression (2). For instruments to be valid they must meet both the relevance and the exogeneity criteria. In the first stage of the supply equation, which some refer to as the reduced form, an F test tests whether the demand shifters are jointly significant. Similarly, in the first stage of the demand equation an F test tests whether the supply shifters are jointly significant. It is found that the “demand shifters” are relevant instruments in the supply equation.
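The joint relevance test described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the estimation code used for Table 2: the data are synthetic, the names `supply_shifter`, `demand_shifters`, `price_relevant` and `price_irrelevant` are hypothetical, and only the F statistic is computed (no p-value). The restricted first stage regresses the endogenous price on the exogenous variables already in the equation; the unrestricted first stage adds the excluded instruments.

```python
import random


def ols_ssr(X, y):
    """Sum of squared residuals from OLS of y on X (X already contains a constant)."""
    k = len(X[0])
    # Normal equations: (X'X) beta = X'y
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        b[c], b[p] = b[p], b[c]
        for r in range(c + 1, k):
            f = A[r][c] / A[c][c]
            for j in range(c, k):
                A[r][j] -= f * A[c][j]
            b[r] -= f * b[c]
    # Back substitution
    beta = [0.0] * k
    for c in reversed(range(k)):
        beta[c] = (b[c] - sum(A[c][j] * beta[j] for j in range(c + 1, k))) / A[c][c]
    return sum((yi - sum(bj * xj for bj, xj in zip(beta, row))) ** 2
               for row, yi in zip(X, y))


def first_stage_f(included, excluded, price):
    """F statistic for H0: all coefficients on the excluded instruments are zero."""
    n, q = len(price), len(excluded[0])
    X_r = [[1.0] + list(row) for row in included]          # restricted first stage
    X_u = [xr + list(z) for xr, z in zip(X_r, excluded)]   # adds excluded instruments
    ssr_r, ssr_u = ols_ssr(X_r, price), ols_ssr(X_u, price)
    return ((ssr_r - ssr_u) / q) / (ssr_u / (n - len(X_u[0])))


# Synthetic data (hypothetical): 27 annual observations, one supply shifter that
# stays in the supply equation, and two demand shifters used as instruments.
rng = random.Random(0)
n = 27
supply_shifter = [rng.gauss(0, 1) for _ in range(n)]
demand_shifters = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
price_relevant = [0.5 + 0.8 * w + 1.0 * z[0] + 0.5 * z[1] + rng.gauss(0, 0.3)
                  for w, z in zip(supply_shifter, demand_shifters)]
price_irrelevant = [0.5 + 0.8 * w + rng.gauss(0, 0.3) for w in supply_shifter]

included = [[w] for w in supply_shifter]
f_relevant = first_stage_f(included, demand_shifters, price_relevant)
f_irrelevant = first_stage_f(included, demand_shifters, price_irrelevant)
print("F, instruments that move price:", round(f_relevant, 2))
print("F, instruments unrelated to price:", round(f_irrelevant, 2))
```

A large F statistic corresponds to the "Yes" outcome of the relevance question in Table 3, and a small one to the "No" outcome; the Pr(>F) values reported there would come from comparing the statistic with the F distribution with q and n − k degrees of freedom.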
Because the demand shifters are relevant, shifts in the demand curve enable identification of the supply curve. In contrast, the “supply shifters” are not jointly relevant instruments for the demand curve. The F tests presented in Table 4 therefore test each supply shifter individually to see whether any of them can identify the demand curve on its own. Table 4 shows that the only supply shifter that is a relevant instrument and can identify demand is the quadratic time trend. This makes sense, since correct signs and statistically significant coefficients could not be obtained until the linear and quadratic time trends were added to regression (3) to form regression (2). Further work therefore needs to be done on specifying the supply equation so that the demand equation has relevant instruments.

In addition to being relevant, an instrument must also be exogenous to be valid. The overidentification test finds that both the supply and demand equations have exogenous instruments. This is important because, when all instruments in the system of equations are exogenous, the 2SLS residuals should be uncorrelated with the instruments.

Table 3. Tests Applied to Supply and Demand Equation

| Question | Supply | Demand | Test Used |
|---|---|---|---|
| Are Instruments Relevant? | Yes, Pr(>F) = .002 | No, Pr(>F) = .2495 | F test for joint significance of supply or demand shifters in 1st stage |
| Are Instruments Exogenous? | Yes, P value from $\chi^2_{11}$ distribution is .99 | Yes, P value from $\chi^2_{11}$ distribution is .99 | Overidentification Test |
| Is There Autocorrelation? | No, autocorrelation not statistically significant at 95% level according to Figures 7 and 8 | No, autocorrelation not statistically significant at 95% level according to Figures 5 and 6 | Autocorrelation Plots and Partial Autocorrelation Plots of Residuals |
| Is There Autocorrelation? | No, correlation between supply and demand curve residuals is .15. This makes autocorrelation unlikely. | No, correlation between supply and demand curve residuals is .15. This makes autocorrelation unlikely. | Cross Equation Error Correlation Test |

Table 4. Testing Which Supply Equation Variables are Valid Instruments (Valid “supply shifters” to identify Demand Equation)

| Variable | Pr(>F) | Valid Instrument? |
|---|---|---|
| ln(corn price lagged one yr) | .85 | No |
| ln(natural gas price) | .1218 | No |
| ln(soybean meal price) | .73 | No |
| ln(federal blenders credit) | .51 | No |
| ln(linear time trend) | .86 | No |
| ln(quadratic time trend) | .09 | Yes |

When using time series it is also important to identify whether or not autocorrelation is present. There are specific tests, such as the Breusch-Godfrey test, that can be used to test for autocorrelation, but for this model autocorrelation plots and partial autocorrelation plots are adequate to determine whether autocorrelation of the error terms exists. Figures 5 and 7 are autocorrelation plots of the demand and supply curve residuals respectively, while Figures 6 and 8 are partial autocorrelation plots of the demand and supply curve residuals respectively. Since all of the non-zero lags stay within the interval bounded by the blue dashed lines, autocorrelation is not statistically significant at the 95% level for supply or demand.

Figure 5. Autocorrelation of Demand Curve Res. (ACF versus lag)
Figure 6. Partial Autocorrelation of Demand Curve Res. (partial ACF versus lag)
Figure 7. Autocorrelation of Supply Curve Res. (ACF versus lag)
Figure 8. Partial Autocorrelation of Supply Curve Res. (partial ACF versus lag)

Although a weaker test, cross equation correlation between the supply and demand equations can also be tested to see whether autocorrelation exists.
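Both residual diagnostics used here, the autocorrelation plot with its 95% bounds and the cross equation residual correlation, can be sketched as follows. This is an illustration only: the residual series are synthetic white noise rather than the residuals from regression (2), the names `supply_res` and `demand_res` are hypothetical, and the bounds are the usual large-sample approximation of ±1.96/√n.

```python
import math
import random


def acf(residuals, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


def pearson(a, b):
    """Pearson correlation between two equally long series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


# Synthetic stand-ins for the 27 annual residuals of each equation (hypothetical).
rng = random.Random(1)
n = 27
supply_res = [rng.gauss(0, 0.18) for _ in range(n)]
demand_res = [rng.gauss(0, 0.19) for _ in range(n)]

bound = 1.96 / math.sqrt(n)          # approximate 95% bounds for a white-noise series
supply_acf = acf(supply_res, 10)
flagged = [k for k in range(1, 11) if abs(supply_acf[k]) > bound]
cross_corr = pearson(supply_res, demand_res)

print("95% bound:", round(bound, 3))
print("lags outside the bound:", flagged)
print("cross equation residual correlation:", round(cross_corr, 3))
```

With 27 observations the bound is about ±0.38, so a lag is flagged only when its sample autocorrelation exceeds that magnitude. Lag 0 is always 1 and is excluded from the check, which matches reading only the non-zero lags on the plots.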
Since the correlation between the supply and demand curve residuals is .15, it is unlikely that there is autocorrelation in the error structure of the supply or demand equations. This means that 2SLS sufficiently models the system of equations and that it is unnecessary to go to 3SLS and do a GLS cleanup of the error structure to account for autocorrelation.

To gain a sense of how well the system of equations predicts the quantity of ethanol supplied and demanded, the predicted quantity demanded can be plotted against the predicted quantity supplied. Although not a quantitative test, Figure 9 shows that the system of equations does a good job of predicting the quantity of ethanol demanded and supplied. This is suggested by the positive linear relationship between predicted quantity supplied and predicted quantity demanded.

Figure 9. Relationship Between Predicted ln(Quantity) Demanded and Supplied, 1982 - 2008 (horizontal axis: predicted ln(quantity demanded) in gallons / licensed U.S. driver; vertical axis: predicted ln(quantity supplied) in gallons / licensed driver)

If the system of equations is well specified, the supply and demand equation residuals should exhibit no systematic relationship and should look like white noise. Figure 4 shows that the relationship between the residuals from the ethanol supply and demand equations is just white noise. This suggests the model is well specified.

Figure 4. Relationship Between Residuals From Ethanol Supply and Demand Equations, 1982 - 2008 (horizontal axis: residuals from demand equation; vertical axis: residuals from supply equation)

Although regression (2) in Table 2 shows that the MTBE ban dummy variable is statistically significant, while the interaction term between the MTBE ban dummy variable and the Reformulated Gasoline Program dummy variable is not, it is important to focus on the economic question of interest.
That question is the net effect of passing an MTBE ban in an additional state while the Federal Reformulated Gasoline program was in place from 1995 to 2006. Determining the net effect requires taking the partial derivative of ln(quantity demanded) with respect to MTBE in the second stage of the demand equation (9). The net effect of passing an MTBE ban in an additional state is therefore:

$$
\frac{\partial \ln(Q_t^d)}{\partial \mathrm{MTBE}_t} = B_6 + B_7 \mathrm{RFG}_t \tag{12}
$$

Applying the estimated coefficients from regression (2) in Table 2, the net effect of passing an MTBE ban in an additional state is:

$$
\frac{\partial \ln(Q_t^d)}{\partial \mathrm{MTBE}_t} = 2.55 - .63\,\mathrm{RFG}_t
$$

Since RFG is a 0/1 dummy variable, increasing the share of the total population living in states with MTBE bans by 10 percentage points (a change of .1 in MTBEt) while the RFG program was in place from 1995 to 2005 would have raised ln(quantity demanded) by (2.55 − .63) × .1 = .192, an increase of roughly 19.2% in the quantity of ethanol demanded. One must remember that in equation (9) quantity is logged, but MTBE, RFG and the interaction term are not. If a state ban on MTBE was implemented after the RFG program was removed, that is in 2006 to 2008, the same increase in the share of the population living in states with MTBE bans would have raised ln(quantity demanded) by 2.55 × .1 = .255, or roughly 25.5%. However, Table 2 shows that B7 is not statistically significant. This means that the effect of an MTBE ban does not depend on whether or not the RFG program is in place. In other words, increasing the share of the total population living in states with MTBE bans by 10 percentage points is estimated to raise the quantity of ethanol demanded by roughly 25.5% regardless of whether or not the RFG program is in place.

The supply curve component of regression (2) shows that ln(ethanol price), ln(soybean meal price), ln(natural gas price) and the quadratic time trend proxying for technology were all statistically significant and of the correct sign. It is surprising that the lagged price of corn was not statistically significant, since corn is the number one input used in ethanol production.
However, the correlation between the corn price lagged one year and the soybean meal price is .78. This multicollinearity between the lagged corn price and the soybean meal price may be inflating the standard errors of the coefficient estimates and causing the lagged corn price not to be statistically significant. Future work must address the potential multicollinearity problem between the lagged corn price and the soybean meal price so that the coefficients can be estimated reliably.

The coefficient of 1.3 on ln(ethanol price) in the supply equation indicates that supply is price elastic. Stated differently, a 1% increase in the price of ethanol leads to a 1.3% increase in the quantity of ethanol supplied. It is also important to note that ethanol supply is highly responsive to changes in the federal blenders credit. A 1% increase in the federal blenders credit leads to a 2.56% increase in the quantity of ethanol supplied. It is also worth noting that although the quadratic time trend is statistically significant, its economic significance is small. It indicates that new technology used in ethanol production causes ethanol production to increase at an increasing rate. When time trends were introduced into the demand equation they were not statistically significant. In the future, an actual measure of ethanol plant efficiency, such as the number of gallons of ethanol produced from one bushel of corn at new ethanol plants, should be used in place of a quadratic or linear time trend if possible. The statistical significance of the quadratic time trend is consistent with the conclusion drawn from the pairwise plots that there are non-linear trends in the ethanol supply curve. Both the supply and demand curves in regression (2) have an R squared value of .96, meaning that 96% of the variation in the quantity of ethanol produced and consumed is explained by variation in the explanatory variables. This is good, but one should not place too much confidence in R squared values.
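As a numeric check on how the coefficients discussed above translate into percentage changes, the following sketch separates the two cases that appear in the system: a log-log coefficient (an elasticity, as for the ethanol price) and a log-level coefficient (as for MTBEt, which enters equation (9) unlogged as a proportion). The helper names are hypothetical, and the coefficient values are the regression (2) estimates from Table 2; the exact figures differ slightly from the linear approximation because of the logarithm.

```python
import math


def pct_change_loglog(elasticity, pct_change_x):
    """Exact % change in y for a pct_change_x % change in x when ln(y) = a + e*ln(x)."""
    return ((1 + pct_change_x / 100) ** elasticity - 1) * 100


def pct_change_loglevel(coef, delta_x):
    """Exact % change in y for a delta_x change in x when ln(y) = a + b*x."""
    return (math.exp(coef * delta_x) - 1) * 100


# Supply elasticity with respect to the ethanol price: coefficient 1.30.
supply_response = pct_change_loglog(1.30, 1.0)

# MTBE net effect from equation (12), for a .1 (10 percentage point) rise in MTBE_t.
mtbe_with_rfg = pct_change_loglevel(2.55 - 0.63, 0.1)   # RFG_t = 1
mtbe_without_rfg = pct_change_loglevel(2.55, 0.1)       # RFG_t = 0

print(f"1% price increase -> {supply_response:.2f}% more ethanol supplied")
print(f"MTBE share +.1 with RFG    -> {mtbe_with_rfg:.1f}% more ethanol demanded")
print(f"MTBE share +.1 without RFG -> {mtbe_without_rfg:.1f}% more ethanol demanded")
```

For small changes the elasticity itself is a good approximation (1.30 versus about 1.302). For the log-level terms the change in ln(Q) is .192 or .255, and converting with the exponential gives somewhat larger percentage changes than reading the log difference directly as a percentage.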
The demand curve in regression (2) provides valuable insight into the impact government policies have on the quantity of ethanol demanded. Ln(ethanol price) and the MTBE ban dummy variable have a statistically significant impact on ln(Qt). Demand for ethanol is found to be price elastic: a 1% increase in the ethanol price leads to a 1.55% decline in the quantity of ethanol demanded. It is not surprising that only two of the seven explanatory variables in the ethanol demand equation are statistically significant, since Table 3 showed that the instruments used to identify the demand equation are not relevant. In addition to better specifying the supply equation so that the supply shifters allow better identification of the demand equation, additional work needs to be done on the demand equation to create instrumental variables for the Reformulated Gasoline Program and the Winter Oxygenate Program similar to the MTBE instrumental variable, since the percentage of the population living in areas affected by these two programs changed over time. If a technique is developed to estimate a system of equations with a state-level panel model, it would be easier to control for state-specific policies such as Minnesota’s 10% ethanol blend mandate.

7. Conclusion

This study demonstrates why the majority of work on ethanol production has been conducted using simulation models rather than systems of equations. Data limitations, endogeneity and multicollinearity are the foremost concerns when modeling the U.S. ethanol market. Although the ethanol supply equation was well identified by the “demand shifters” Pg, PCI, RFG, WD, RFG*MTBE and MTBE, the demand equation could not be identified by the supply shifters. The supply shifters used were Png, Pct-1, Psm, BC, TECH and TECHSQ. When tested individually instead of jointly, only TECHSQ was found to be a relevant instrument.
This implies that the technology used in ethanol production has caused the quantity of ethanol produced to increase at an increasing rate. Future research is necessary to correctly specify the supply equation so that the demand equation can be identified. It is likely that multicollinearity between the lagged corn price and the soybean meal price is causing the “supply shifters” not to be valid instruments for identifying the demand curve. Apart from the failure to identify the demand equation, all other measures suggest the system of equations is well specified.

Works Cited

Food and Agricultural Policy Research Institute (FAPRI). “State Support for Ethanol Use and State Demand for Ethanol Produced in the Midwest.” University of Missouri, Columbia, Nov. 2008.

Greene, William H. Econometric Analysis. Upper Saddle River: Prentice Hall, 2003.

Johnson, Craig A. “Plants Return to Production After Idle Summer.” Ethanol Producer Magazine. (December 2009).

Lin, C.-Y. Cynthia. “Estimating Annual and Monthly Supply and Demand for World Oil: A Dry Hole?” 29 March 2004. Department of Economics, Harvard University.

Low, S.A. and A.M. Isserman. “Ethanol and the Local Economy: Industry Trends, Location Factors, Economic Impacts and Risks.” Economic Development Quarterly. 23, 1 (Feb. 2009): 71-88.

Luchansky, M. and J. Monks. “Supply and Demand Elasticities in the U.S. Ethanol Fuel Market.” Energy Economics. 31 (2009): 403-410.

Montana State University’s Department of Agricultural Economics and Economics. “Futures Data” [computer file]. Bozeman, MT: Chicago Board of Trade [producer], 8 Dec. 2009.

Nebraska Energy Office. “Ethanol and Unleaded Gasoline Average Rack Prices.” Nebraska Energy Statistics. 5 Nov. 2009. Nebraska Energy Office. 25 Nov. 2009 <http://www.neo.ne.gov/statshtml/66.html>.

Rask, Kevin N. “Clean Air and Renewable Fuels: the Market for Fuel Ethanol in the US from 1984 to 1993.” Energy Economics. 20 (1998): 325-345.

Renewable Fuels Association. (2010). Public Policy. Retrieved April 20, 2010, from http://www.ethanolrfa.org/pages/public-policy.

United States. Department of Commerce. “Price Indexes For Gross Domestic Product.” Bureau of Economic Analysis. 24 Nov. 2009. U.S. Department of Commerce. 8 Dec. 2009 <http://www.bea.gov/national/nipaweb/TableView.asp?SelectedTable=4&Freq=Qtr&FirstYear=2007&LastYear=2009>.

United States. Department of Energy. “Energy Timelines, Ethanol.” Energy Information Administration. June 2008. U.S. Department of Energy. 25 Nov. 2009 <http://tonto.eia.doe.gov/kids/energy.cfm?page=tl_ethanol>.

United States. Department of Energy. “November 2009 Monthly Energy Review.” Energy Information Administration. 25 Nov. 2009. U.S. Department of Energy. 25 Nov. 2009 <http://www.afdc.energy.gov/afdc/data/fuels.html>.

U.S. Environmental Protection Agency. “Renewable Fuel Standard Program.” 2010. Available at http://www.epa.gov/oms/renewablefuels/ (accessed February 27, 2010).

United States. Environmental Protection Agency. “State Actions Banning MTBE.” August 2009. U.S. Environmental Protection Agency. March 2010 <http://www.epa.gov/mtbe/420b04009.pdf>.

United States. Census Bureau. “2010 Statistical Abstract.” February 2010. U.S. Census Bureau. April 2010 <http://www.census.gov/compendia/statab/cats/population.html>.

U.S. Department of Agriculture. “Feed Outlook.” Economic Research Service. 14 Jan. 2010. U.S. Department of Agriculture. 5 April 2010 <http://usda.mannlib.cornell.edu/usda/ers/FDS//2010s/2010/FDS-01-14-2010.pdf>.

United States. Department of Transportation. “Highway Statistics 2007.” Federal Highway Administration. 29 April 2009. U.S. Department of Transportation. 25 Nov. 2009 <http://www.fhwa.dot.gov/policyinformation/statistics/2007/dlchrt.cfm>.