TAX NON-COMPLIANCE AND CORPORATE GOVERNANCE: A COMPARATIVE STUDY*

Lindsay M. Tedds
Department of Economics, University of Manitoba
503 Fletcher Argue Building
Winnipeg, Manitoba, Canada, R3T 5V5
Email: tedds@cc.umanitoba.ca
Phone: (204) 474-9275

This version: May 2006

Abstract

It is generally accepted that taxes and tax evasion are intrinsically linked. While there exists an extensive knowledge base regarding tax compliance by individuals, very little is actually known about tax compliance by businesses due to a lack of detailed and readily available data. This paper uses a unique and recently available dataset that contains detailed information on over 10,000 firms (broadly defined) from around the world to investigate factors, particularly corporate governance, that affect business tax compliance. The empirical strategy employed exploits the nature of the dependent variable, which is interval coded, and uses interval regression, which provides an asymptotically more efficient estimator than the ordered probit, provided that the classical linear model assumptions hold. These assumptions are investigated using standard diagnostic tests that have been modified for the interval regression model. Evidence is presented that shows that firms in all regions around the world engage in tax non-compliance, but that there is substantial variation across and within regions. The legal organization of a business is found to affect tax compliance, as do size, ownership, competition and audit controls, but industry and age have no effect. Not surprisingly, high taxes and government corruption result in reduced compliance while access to financing, organized crime, political instability and the fairness of the legal system do not affect compliance levels.
Keywords: Underground Economy, Tax Non-compliance, Firm Characteristics, Interval Regression

JEL Classification: C24, D21, O17

* I would like to thank Tom Crossley, Mike Veall and David Bjerk, all of McMaster University, David Giles, University of Victoria, Wayne Simpson, University of Manitoba, seminar participants at the University of Victoria, and participants at the 2005 Tax Policy: Perspectives from Law and Accounting conference for their helpful comments and invaluable guidance, Hassan Azziz for his assistance with the literature review, and Adian McFarlane for his diligent research support. The author would also like to gratefully acknowledge the financial support from the Social Sciences and Humanities Research Council (#752-2004-1096 and #501-2002-0107) and the University of Manitoba. I am solely responsible for any remaining errors and omissions.

1 Introduction

It is generally accepted that taxes and tax evasion are intrinsically linked; one cannot exist without the other. As a result of a great deal of theoretical, experimental, and empirical research conducted over the last twenty years, there exists an extensive knowledge base regarding tax evasion by individuals. However, research regarding tax evasion1 by businesses is, by comparison, surprisingly modest. This is startling, given the importance of businesses and their decisions not only in economic models but also in tax systems and the economy as a whole.

There is some evidence to suggest that there is cause for concern: a substantial share of business income goes unreported to the relevant tax authorities. The United States Internal Revenue Service (IRS) routinely estimates the total amount of under-reported income and overstated deductions, and calculates the total loss of tax revenue, or the “tax gap”. The latest data from the IRS (2004) regarding the “tax gap” related to business activities are for the 2001 tax year.
These estimates, denoted in U.S. dollars, indicate that: (1) the corporate tax gap amounted to $29.9 billion, of which corporations with over $10 million in assets contributed $25.0 billion; (2) the tax gap associated with business income earned by individuals amounted to $81.2 billion; (3) the self-employed evaded $61.2 billion in employment tax; and (4) corporations underpaid the amount of taxes due based on reported income by $2.3 billion. In total, businesses evaded $174.6 billion in taxes, which amounted to almost 10% of total taxes paid voluntarily. This is not an insignificant amount and yet should be considered a lower bound estimate because: it is based on twenty-year-old compliance rates; it does not include businesses that do not file tax returns (also known as nonfilers, the hard-to-tax, ghosts or informal businesses); and/or it does not consider firms that are engaged in illegal activities. While these data are for the United States, it is not unreasonable to assume that firms evade taxes in every country around the world.

1 The focus of this paper is on illegal tax evasion and not legal tax avoidance. Tax evasion or tax non-compliance refers to income tax that is legally owed but is not reported or paid whereas tax avoidance refers to legal actions taken to reduce tax liability. Tax avoidance includes such activities as “…purchasing tax-exempt bonds, which is certainly legal, not at all nefarious, but also certainly done for tax reasons.” (Slemrod 2004, 4)

Given that there is some evidence which supports the notion that businesses engage in tax non-compliance, several questions arise and require investigation. First, do businesses around the world engage in tax evasion or is it confined to a few countries or regions? Second, does the legal status of the business (e.g. sole proprietorship, partnership, corporation, etc.) affect the incidence and/or intensity of tax evasion?
If so, then it may be possible to effect changes in the legal system in order to increase tax compliance. Third, do businesses that engage in tax non-compliance share common and observable characteristics, or is there too wide a variety of shapes and sizes to permit a useful generalization about them? If it were possible to define a typical evader, the tax authority could target its auditing activities more accurately. Finally, while there is considerable agreement internationally about the factors that trigger tax non-compliance (e.g. the tax burden, the degree of regulation, the level of enforcement, confidence in government, labour force characteristics, and morality), how do these features influence the intensity of non-compliance? With this information, policy makers could effect changes to increase the amount of tax revenues collected from businesses.

One of the main constraints to investigating and attempting to provide answers to these and related questions is the lack of data. Previously, the only data sources available were tax audits. However, these sources are only available for a very small number of select countries, the data are costly to collect, and access to the data is limited. In addition, tax audit data are subject to both selection and measurement error bias. The selection effect occurs because firms are generally not randomly audited. Rather, firms are usually selected for an audit because they have a particular characteristic (e.g. U.S. businesses with more than $10 million in assets are audited annually) or because their tax filing documents raise a “red-flag” with the tax authority. The measurement error bias occurs because not only are audits unlikely to completely and accurately measure tax non-compliance but also the final non-compliance report is usually based on negotiations between the firm and the tax authority as well as various legal appeals.
More recently, however, an alternative data source has become available that is conducive to investigating issues related to business tax non-compliance. The World Business Environment Survey (WBES) was launched in 1999 by the World Bank’s Investment Climate and Institute Units. The survey was administered to more than 10,000 firms in eighty countries in late 1999 and early 2000, and provides information regarding a firm’s characteristics as well as responses to multiple questions on the investment climate and business environment. The survey also includes a question regarding the extent and intensity to which firms fail to report income to the tax authority. Rather than reporting the exact amount of income that went unreported to the tax authority, firms were provided with seven possible intervals of varying size that were grouped according to the percentage of sales that are reported for tax purposes. When a quantitative outcome is grouped into known intervals on a continuous scale the data are said to be interval-coded. Asymptotically consistent estimates can be obtained using interval regression. The advantage of interval regression is that, provided certain key assumptions are met, the coefficients on the exogenous variables can be interpreted as if ordinary least squares were applied to a continuous dependent variable.

Overall, the findings reveal that firms which are sole proprietorships and privately owned corporations report a smaller percentage of their sales to the tax authority. The previous literature has suggested that firms in the service and construction sector should be less compliant but this study finds that industry sector has no effect on compliance. There is no consensus in the previous literature about the relationship between firm size and under-reporting, but it is found, unambiguously, that small firms are less compliant and large firms are more compliant.
Foreign owned firms, government owned firms, and firms that have audited financial statements are also more compliant, as was also found by other researchers. Not surprisingly, high taxes and government corruption result in lower compliance and the magnitude of these effects is quite large. Access to capital, political instability, organized crime, inflation, and the legal system are found to have no statistically significant effect on tax non-compliance.

The paper begins with a brief review of the relevant tax non-compliance literature. This literature makes it clear that a business’s legal organization likely affects the decision to evade taxes but other factors, such as firm size, have ambiguous effects or, at the very least, vary across countries. Therefore, it seems worthwhile to conduct a worldwide study of firms’ tax compliance behaviour. The WBES is then described and the rationale for the empirical techniques is outlined. The results are then summarized and the paper ends with some concluding comments.

2 Literature Review

In this section, attention is focused on two critical aspects of the literature regarding tax evasion. First, a review of the development of the theoretical literature and the associated predictions is provided, commencing with the classical model of an individual’s decision to evade taxes and how the model has been modified for firm behaviour. Second, there have been a few empirical papers that explore firm tax non-compliance, and this literature is summarized along with the key findings. This literature will help shape the empirical model utilized in this paper.

2.1 Theoretical Studies

As previously noted, extensive literature exists on tax evasion by individuals and the seminal contribution was provided by Allingham and Sandmo (1972).2 Their model was sparse but surprisingly robust in modeling an individual’s decision to evade taxes and, if so, how much to evade.
The model leads to four propositions about the incidence of tax evasion: (1) the rate of return to evasion is positively related to the incidence of tax evasion; (2) individuals with higher risk aversion tend to evade less; (3) individuals with higher personal income tend to evade more; and (4) compliance is positively related to the probability of being audited and the size of the penalty if caught.3 The first and third propositions lead one to conclude that, in the context of business tax non-compliance, the self-employed, sole-proprietors and other small businesses would be less likely to evade taxes than large businesses. The fourth proposition leads to the opposite conclusion given that large businesses and corporations are normally subjected to audits at a higher rate than are smaller businesses.4 However, the true effect of these propositions is muddied when the choice is not only the level of evasion but also the level of output, as with most businesses, and when there is no direct link between the owner’s income, risk preference, audit probability, and evasion decisions. This is particularly true for businesses where evasion decisions are made by managers and not the owner(s).

2 The Allingham-Sandmo (1972) model has been extended in a number of dimensions over the last thirty years (e.g. Watson 1985, Trandel and Snow 1999, Mookherjee and P’ng 1989, Border and Sobel 1987, Scotchmer 1987, Cremer, Marchand and Pestieau 1990, and Sanchez and Sobel 1993) and this literature is nicely surveyed by Andreoni, Erard, and Feinstein (1998) and Slemrod and Yitzhaki (2002).
3 The relationship between tax rates and tax evasion, however, is uncertain. A higher tax rate implies less income which, according to the model, implies less evasion. On the other hand, an increase in the tax rate implies that the “return” from evasion increases as well, assuming a constant penalty, which implies more evasion. As a result, the two forces are of opposite tendency which makes the net result uncertain.
4 For example, in the United States corporations with more than $10 million in assets face an audit rate of nearly 100 percent.

Given the above discussion, the reasons driving many businesses to evade taxes are likely different from those driving individuals and, hence, should be modeled differently. There is a much smaller pool of literature that addresses tax non-compliance by businesses.5 The majority of this work utilizes the basic framework of the Allingham-Sandmo (1972) model, but the key variation in these models is that they explore how the tax rate, probability of detection, and penalty rate affect the two choices of evasion and output.6 In these models, it is generally found that: (1) reported sales decrease as the tax rate increases; (2) a rise in taxes increases the market price of the good, but by less than the amount of the tax, since some of the tax increase is absorbed through increased evasion; and (3) an increased probability of detection or penalty increases the proportion of sales declared and the market price of the good. As opposed to the propositions of the Allingham-Sandmo (1972) model, there is an unambiguous positive relationship between the tax rate and tax evasion. Further, it should be noted that these results hold regardless of whether the market in which the firm is operating is competitive or monopolistic.

The preceding literature, however, continues to assume that an individual is at the centre of the tax decision. That is, the above noted firm models assume that the firm owner makes the tax reporting decision. This assumption, however, likely only applies to the self-employed and other small, privately-held businesses.
This suggests that the outcomes predicted by the Allingham-Sandmo (1972) model may not apply to other types of businesses, notably those businesses where financial decisions, including those related to taxes, are not made by the owner/shareholder but, rather, by their agents. With this in mind, Chen and Chu (2002) extend the standard model to include a firm that hires a risk-averse manager who is offered some form of compensation (e.g. stock options, bonuses, etc.) as an incentive to engage in tax evasion on behalf of the firm. As opposed to the finding of the Allingham-Sandmo (1972) model, that an individual will only evade taxes if the rate of return from evasion is greater than the rate of return from compliance, Chen and Chu’s model implies that a firm will evade tax only when the expected profit from evasion is significantly greater. This is because tax evasion by a business actually involves the interaction of many agents and is much more complicated than individual income tax evasion.

5 Cowell (2004) provides an excellent review of this literature.
6 There is no agreement in this literature on whether the firm in these models should be modeled as risk-averse or risk-neutral.

The principal-agent model proposed by Chen and Chu, however, does not consider the role of penalties in the tax evasion decision. To address this shortcoming, Crocker and Slemrod (2005) propose a model of “…corporate tax evasion in the context of the contractual relationship between the shareholders of a firm and the chief financial officer (CFO), who determines the firm’s deductions from taxable corporate income.” (p. 1594). The model implies that corporate tax evasion is reduced when penalties are imposed on the CFO directly, as opposed to the shareholder, and that tax evasion increases if the CFO’s compensation contract optimally adjusts to offset the penalties imposed when evasion is detected.
2.2 Previous Empirical Studies of Business Tax Non-Compliance

Unfortunately, the empirical analysis of business tax evasion is not extensive, mainly due to a lack of data. There have, however, been a small number of empirical studies that will be summarized here.

One of the first empirical examinations into business tax non-compliance focused on the self-employed.7 The self-employed are commonly believed to have lower compliance rates than wage and salary earners, primarily due to a lack of third-party reporting and withholding. Smith et al. (1986) obtain estimates that indicate that in 1982 the self-employed in Great Britain understated their income in the range of 30 to 36%. Pissarides and Weber (1989), using the same data, improve on Smith et al.’s approach and find that the self-employed under-reported their income by up to 90%. This modified approach was subsequently applied by: (1) Apel (1994), who estimates that the self-employed in Sweden under-reported their income by 25% in 1988, though this figure rises to 35% for the self-employed who own firms that are unincorporated (which supports the proposition that the legal organization of the business does affect the tax evasion decision); (2) Mirus and Smith (1997), who concentrate their analysis on Canada and obtain an estimate of 12.5% for the year 1990; and (3) Schuetze (2002), who also applies the approach to Canada and finds that the self-employed under-reported their income by between 11 and 23% over the period from 1969 to 1992 and that the construction and service occupations are more likely to be involved in tax non-compliance. Lyssiotou et al. (2004) propose further modifications and conclude that the self-employed in Great Britain in 1993 under-reported their income by 118% if they were in blue collar occupations and 64% if they were in white collar occupations. Finally, Tedds (2005) introduces a nonparametric framework to the approach and finds that the gap between true and reported self-employment income is larger for households at the lower end of the self-employment income distribution, a result which runs contrary to the theoretical prediction of the Allingham-Sandmo (1972) model.

7 A self-employed person can register their business in a variety of forms, including a corporation, which, based on the theoretical literature, likely affects their tax compliance behaviour. Unfortunately, only Apel (1994) explored the relationship between tax non-compliance and the legal form of the business, largely due to the lack of information on the latter in the data employed by these studies.

Smith and Adams (1987) examine the extent of tax non-compliance by informal suppliers in six “at risk” sectors8 using results from a survey commissioned by the U.S. Internal Revenue Service (IRS) on the expenditures made by consumers on goods and services provided by these suppliers. For the 1992 tax year, the authors find that unreported income by these informal suppliers amounted to approximately US$59.6 billion and that informal suppliers tend to report only about 20% of their net business income.

8 They were: home repairs and additions, domestic services, auto repair, music lessons, appliance repairs, cosmetic services, and catering.
Several studies have investigated business tax non-compliance using data from tax audits. Probably the most comprehensive tax audit dataset in the world is the U.S. IRS Tax Compliance Measurement Program (TCMP).9 Rice (1992) used TCMP data from 1980 to investigate tax compliance by small corporations (defined as corporations with assets between $1 million and $10 million). His findings suggest four key results. First, compliance is higher among publicly traded corporations, which he attributes to the requirement that publicly traded corporations must disclose more information to the public about their operations.10 Second, high profit companies are more likely to under-report their income, and corporations whose profits are below the industry mean tend to resort to non-compliance, perhaps as a means of limiting costs. Third, the marginal tax rate is negatively associated with compliance. Finally, firm size and tax non-compliance are positively related, and corporations engaging in tax non-compliance appear to be geographically bundled. The latter finding is interesting and may be evidence that the compliance behaviour of others directly affects one’s own compliance behaviour. Joulfaian (2000), using TCMP data from 1987, found that non-compliant corporations are three times more likely to be managed by executives who have evaded personal taxes. TCMP data have also been used to explore tax evasion by the self-employed and these studies include Christian (1994), Erard (1992), and Joulfaian and Rider (1998).

9 The TCMP features data from a random sample of individual and small corporate income tax returns filed in a given year that were subject to intensive audits by experienced examiners. For certain groups, such as non-filers and proprietors who tend to not report a significant amount of their income, the results from special research studies are used to supplement TCMP data. Finally, data for large corporations are obtained from routine operational audits.
10 Tannenbaum (1993), however, disagrees with this statement and instead argues that higher compliance could be the result of managers in these corporations having greater independence from the owners.

Hanlon et al. (2005) are the first to use operational data from the Voluntary Compliance Baseline Measurement (VCBLM) program compiled by the Large and Mid-Sized Business (LMSB) Research Division of the IRS to examine corporate tax non-compliance, as measured by deficiencies identified during audit. These data are more up to date, containing audit files up to and including 2002, than the TCMP, which ended in 1988. They find that business tax non-compliance relative to scale is a convex function, with medium-sized businesses (within the population of firms with assets exceeding $10 million) having the lowest rate of non-compliance. Private firms, multinationals and firms with incentivized executive compensation schemes are associated with higher non-compliance (providing empirical evidence for the predictions from the model proposed by Crocker and Slemrod (2005)) while foreign controlled firms are more compliant. In general, firms in the manufacturing industry, trade, transportation and warehousing industry, and education, healthcare and warehousing industry are less compliant. They also find no association between compliance and effective tax rates.

Giles (2000) discusses some of the factors that determine the probability of non-compliance among a very large population of New Zealand businesses that were audited by the Inland Revenue Department in that country between 1993 and 1995. Contrary to Rice (1992) and Hanlon et al. (2005), Giles finds that an increase in the scale of the business, regardless of how this is measured, unambiguously raises the probability of compliance, once other characteristics are controlled for.
That is, businesses that are relatively small, in terms of sales revenue or before-tax profit, are more likely to evade taxes than are large corporations, all other things being equal. Businesses in the construction, wholesale trade, retail trade, accommodation, and cafes and restaurants sectors exhibited below-average compliance rates over the study period. He also considers several other characteristics that were found to be important in reducing the tax compliance rate among New Zealand businesses. Relatively inefficient businesses tended to be less compliant than more efficient ones, where efficiency was defined as either “return on net assets”, or “activity ratio” (sales as a percentage of net assets). In addition, businesses which were registered off-shore were generally more compliant than their on-shore counterparts, a finding that is shared by Hanlon et al. (2005). Finally, it was found that in general, an aggressive use of legitimate tax-minimization instruments (i.e. tax avoidance behaviour such as the deduction of interest and depreciation costs, and the writing-off of bad debts) tended to be associated with increased compliance.

Many countries use tax holidays to attract foreign investment by providing a limited period of tax exemptions and reductions for qualified investors. Chan and Mo (2000) examine the effect of tax holidays on foreign investors’ tax non-compliance behaviour in China. They analyzed 583 tax audit cases, made available by the Chinese tax authorities, on corporate tax non-compliance by foreign investors.
Their results indicate that the corporate taxpayer’s tax holiday position significantly affects non-compliance, notably: (1) companies in the pre-holiday position are least compliant; (2) companies are most compliant in the tax exemption period that has a zero tax rate and a heavy penalty for evasion; (3) domestic market-oriented companies have a higher rate of non-compliance than their export-oriented counterparts; and (4) wholly foreign-owned and manufacturing-oriented companies have higher compliance than joint ventures and service-oriented companies.

Only two studies appear to exist which use data on firm-reported tax non-compliance. Using firm level data from a 1997 survey of private manufacturing firms in Poland, Romania, Russia, Slovakia, and the Ukraine, Johnson et al. (2000) investigate the relationship between government corruption, criminal activities, and firm tax compliance. The dependent variable (percentage of sales that are unreported to the tax authority) is similar to that which is used in this paper11 (discussed below), except that it is a continuous variable (rather than grouped data as in the WBES). The authors found a positive and significant relationship between under-reporting of sales and bribing of corrupt officials, but no relationship between under-reporting of sales and protection payments to the mafia, tax payments, or efficiency of the legal system. Finally, firm tax non-compliance was greater in Russia and the Ukraine than in the other countries included in the study.

Batra et al. (2003) were the first to use the WBES to investigate the determinants of under-reporting by firms. The WBES asks each firm to provide an estimate of the percentage of sales revenues that firms like their own report to the tax authority. As indicated previously and discussed in more detail later in this paper, seven answers were possible: all, 90-99%, 80-89%, 70-79%, 60-69%, 50-59%, and less than 50%.
The authors define seven binary variables based on these seven intervals and each binary variable takes a value of one if the firm selected the particular interval of interest and zero otherwise. They then estimate seven different OLS regressions which include variables for firm characteristics, rule of law, business constraints and country fixed effects. Overall they find that: (1) “…small or medium-size firms that produce for the domestic market (non-exporters), lack foreign investment, and are located in large cities (but not necessarily in the capital) tend to engage more in unofficial activity” (p. 76); (2) the prevalence, though not the unpredictability, of corruption also significantly affects non-compliance; and (3) “…a firm’s age, sector or mode of ownership do not influence [a firm’s] under-reporting of revenue.” (p. 78).

11 The question regarding the percentage of sales not reported was prefaced by “It is thought that many firms in your industry, in order to survive and grow, may need to misreport their operational and financial results. Please estimate the degree of under-reporting by firms in your area of activity.” This is similar wording to that which appears in the question regarding tax non-compliance in the WBES.

There are several reasons to be concerned about these reported results. First, the number of observations in each of the seven regressions ranged from a low of 3,802 to a high of 4,781 from a total number of observations of over 10,000 available in the WBES. The authors do not discuss the reasons for the attrition but it is likely the result of missing observations and non-response for both the dependent and independent variables. As attrition of this magnitude can lead to biased results, it is important that it be investigated further. Second, the authors do not link their choice of explanatory variables to the existing theoretical or empirical literature regarding firm tax non-compliance.
As a result, some potential controls are overlooked and some controls may be misspecified. Third, the equations across categories are not all identically specified and it is not clear if the results should be compared across regressions, given the different specifications. Finally, Batra et al. (2003) do not exploit the nature of the dependent variable. The firm’s response regarding under-reporting behaviour is grouped into categories. When a quantitative outcome is grouped into known intervals on a continuous scale, the data are said to be “interval-coded”.12 However, Batra et al. (2003) define the dependent variable as a binary outcome for each category and estimate the resulting equations by Ordinary Least Squares (OLS). The considerable statistical limitations of such a linear probability model are well known.13 There is an estimation technique that has been developed specifically for interval-coded data. This estimation procedure is known as “interval regression” and is undertaken using maximum likelihood techniques. This paper addresses these shortcomings. In particular, it uses the interval regression technique and the extent to which various covariates affect the estimation results is also explored. In addition, this paper uses information from existing theoretical and empirical literature of firm tax non-compliance to build the empirical model explored in this paper. Finally, the issue of missing observations or non-response is explored. 12 When the interval thresholds are unknown and have to estimated along with the other parameters, an ordered logit/probit is the preferred estimation framework. 13 In the WBES, over 40% of the firms surveyed indicate that 100% of their sales are reported to the tax authority and between 6-14% of firms appear in one of the other categories. This means that the linear probability models estimated by Batra et al. will be either zero or one inflated. 
This results in a large number of predicted probabilities falling outside the unit interval.

3 Data

In 1998, the World Bank Group launched its World Business Environment Survey (WBES). The WBES used many of the same questions from the enterprise survey conducted for the 1997 World Development Report14 (World Bank 1997) but expanded the number of businesses and countries surveyed and the questions/issues covered. Between late 1998 and early 2000, face-to-face interviews15 were conducted with either managers or owners of just over 10,000 firms in eighty countries (plus the West Bank and Gaza). The purpose of the survey was to assess and compare the business environment in a large number of countries. To achieve this goal, the survey gathered information regarding the firm’s characteristics, such as size and ownership structure, as well as responses to multiple questions on the investment climate and the local business environment as shaped by domestic economic policy, governance, regulatory, infrastructural, and financial impediments, as well as assessments of public service quality. A more detailed description of the survey can be found in the Appendix and in Batra et al. (2003).

Of interest to this study, the survey included a question regarding the extent and intensity to which firms fail to report income to the tax authority, which permits exploring the links between tax non-compliance and various firm, business, governance, and political characteristics. The main advantages of the WBES over audit data include the fact that: (1) audit data are not widely available, unlike the WBES, which covers eighty countries; (2) audit data only include firms that are selected (for diverse reasons) or caught by the tax authorities, while the WBES is a broad sample of firms; and (3) the WBES includes additional information that is not included in tax audits. In particular, factors that firms perceive as business obstacles, such as taxes and regulations, which may affect a firm’s decision to under-report, are included. The WBES does not come without disadvantages and these will be discussed below. One disadvantage shared by both the WBES and audit data is that neither data source includes “ghosts” (Erard and Ho 2001): firms that operate solely in cash and avoid normal business obstacles and regulations. Omitting these firms will bias estimates of tax non-compliance downwards and this bias will be larger for those countries that have a larger percentage of unregistered firms.

14 Unfortunately, this dataset contains neither detailed information on characteristics of the firms nor the key variable of interest contained in the WBES dataset. As a result, it cannot be appended to the WBES dataset.
15 With the exception of Africa, where interviews were predominantly conducted by mail.

3.1 Dependent Variable

As was indicated above, the intent of this paper is to explore the relationships between firm characteristics and tax compliance using firm-level data collected from around the world. How can the WBES be used to investigate firm tax compliance? The WBES asks each firm the question: “What percentage of total sales would you estimate the typical firm in your area of activity reports for tax purposes?” Possible answers: (1) less than 50%; (2) 50-59%; (3) 60-69%; (4) 70-79%; (5) 80-89%; (6) 90-99%; and (7) all (100%).16 The variable is thus not expressed in actual quantities as with Johnson et al. (2000). This represents something of a constraint but its key advantage is that the interval-coded nature of the responses is likely to reduce a respondent’s reporting bias. The large number of intervals, most of them of equal size, is beneficial as it provides information on the scale of under-reporting rather than simply its incidence. Given that the incidence of under-reporting is relatively high, as will be shown below, information on the scale of under-reporting is extremely useful.17

16 In the WBES, the dependent variable is actually coded as follows: What percentage of total sales would you estimate the typical firm in your area of activity reports for tax purposes? Possible answers: (1) all (100%); (2) 90-99%; (3) 80-89%; (4) 70-79%; (5) 60-69%; (6) 50-59%; and (7) less than 50%. The order of the responses is reversed solely for expositional clarity.

Another disadvantage is that firms are asked about their perceived behaviour of the “typical firm” rather than the behaviour of the firm being interviewed, similar to the wording contained in the survey used though not discussed by Johnson et al. (2000), and this may result in response bias. However, there are several things to consider. It is well known that survey questions on sensitive topics, such as tax non-compliance, are subject to tremendous levels of both response and nonresponse bias (Jones and Darroch Forrest 1990; Lemmen et al. 1988; Mensch and Kandel 1988; and Rogers and Turner 1991). One obvious solution is for the surveyor to provide confidentiality assurances to the respondent but these rarely address all of the concerns of the respondent and, hence, do not yield the desired effect (Rasinski et al. 1999). Another solution is to ask hypothetical questions but such questions usually do not produce clear and consistent data representing real activities when they ask about situations the respondent has never encountered first hand. By far the most popular solution is to use proxy-responses, where the individual responding to the survey provides information about another individual. This is a technique that has become popular in household surveys where a household member provides responses for household members who are absent. In the case of the WBES, the firm is not asked about the behaviour of a particular firm but of a typical firm in the same area of activity as that of the firm being surveyed. This question is a mix between a hypothetical situation and a proxy response.
First, based on the phrasing of the question, respondents are likely to base their response, at least in part if not in whole, on their own reporting behaviour, which would minimize the bias. Second, firms will not feel stigmatized by the interviewer and/or fear possible repercussions when responding to the question, which again will minimize bias. Finally, firms are likely to base their own behaviour on the perceptions that they have regarding the behaviour of other firms in their area of activity, especially those that they compete with. If they perceive firms in their area of activity to be engaging in tax non-compliance, this will influence their reporting decision.

17 Further, the interval coding of the data along with the lack of data on the dollar amount of sales for the firms surveyed in the WBES means that the tax gap cannot be calculated. The tax gap is the difference between the taxes paid and the taxes that should have been paid. Such a calculation is useful in determining the amount of tax revenues lost to non-compliance.

The distribution of answers to the sales reporting question over the entire sample is given in Table 1a. The first column of the table indicates that just over 18% of the firms in the sample did not respond to the question of interest. It is quite common in survey data to have missing data of this nature. If the data are missing for unknown reasons and the missing data are unrelated to the completeness of the other observations then it is reasonable to exclude the missing observations from the analysis, though efficiency is sacrificed. If, however, the missing data are systematically related to the phenomenon being modeled then there will be a sample selection problem. For example, if firms whose response is missing for the sales reporting question are more likely to be firms who under-report their income, then the missing observations represent more than just missing information.
Unfortunately, there is insufficient evidence to definitively ascertain the reasons for the missing observations in this particular case. Rather, the relationship between these missing observations and the firm characteristics that will be used as explanatory variables in the model was explored. If the missing observations are randomly distributed amongst these explanatory variables then this provides some evidence that the missing information may not cause systematic bias in the estimates. There appeared to be no significant systematic relationship between nonresponse to the sales reporting question and the explanatory variables, with one exception noted below.

Of the firms that responded to the sales reporting question, Table 1b shows that 60% of firms worldwide indicate that the typical firm fails to report their sales in full to the tax authority and, of those firms, over 19% fail to report more than half their sales. This shows that business tax compliance is a significant issue. Not surprisingly, there appear to be some differences in perceived tax non-compliance across regions, as is shown in Figure 1. In particular, in OECD countries only approximately 40% of firms are perceived to under-report their sales to the tax authority and of those, approximately 50% fail to report only up to 10% of their sales. Further, compared to other regions, firms in Latin America and Asia perceive that significantly more firms report less than 50% of their sales. In addition, there are significant differences in perceived firm tax compliance across countries. Given these differences, the effect of country specific dummy variables on the results will be investigated in the empirical analysis.

There are, however, some differences in the distribution of the missing observations discussed above with respect to region and country (Table 2 lists the countries covered by the survey, sorted by region).
Only 11% of firms located in the former Soviet Union, approximately 15% of firms in the OECD and Latin America, approximately 20% in Asia and in Africa and the Middle East, and 27% in Transition Europe failed to respond to the sales reporting question. The missing observations appear to be randomly allocated amongst the countries in Latin America and in Africa and the Middle East. In Transition Europe, no firm located in Albania responded to the question, indicating that the survey was not carried out in a similar fashion to that in other countries rather than a response bias. As a result, Albania is excluded from the analysis (163 observations, leaving 9,869). Higher non-response also occurs in Malaysia, India and Thailand in Asia and in Russia, Georgia, Kazakhstan, and the Ukraine in the Former Soviet Union, but in all cases the nonresponse rate is not dramatically higher than that experienced in other countries in the regions. In the OECD, France, Canada and the United States all have a higher response rate than the other countries in the group but again the difference is not alarming.

3.2 Explanatory Variables

As was outlined above, previous empirical work using audit data found relationships between tax non-compliance and various firm characteristics and these relationships will also be explored in this paper. The firm characteristics that will be included as explanatory variables include dummy variables for: industry sector, size, age, ownership, export status, whether the firm’s financial statements are audited, legal ownership, and number of competitors. In the secondary analysis, the effect of various perceived business obstacles on tax compliance will be investigated, including access to capital, taxes, regulations, organized crime, government corruption, inflation, and exchange rates. As the sample size decreases due to non-response with the inclusion of these latter explanatory variables, they are only included in a secondary regression.
3.2.1 Legal Status

The previous literature makes it clear that tax non-compliance likely varies across governance models. The WBES provides an opportunity to explore the relationship between the legal structure of the business and tax non-compliance. Firms in the WBES are categorized as being either: sole proprietorships (18%), partnerships (17%), cooperatives (3%), private corporations (28%), public corporations (9.6%), or other (15%).18 Dummy variables for business type need to be created if this characteristic is to be included in the analysis. Few firms in the sample indicated that their business type is a cooperative. Consequently, these firms will need to be included in one of the other categories. As cooperatives are accountable to their members, cooperatives are more comparable to public corporations than to any of the other business structures. Cooperatives are, therefore, included with public corporations. The remaining governance categories are treated individually.

18 Fewer than 9% (836) of the firms in the sample did not respond to this question.

Based on the literature, it is expected that publicly traded corporations will be the most compliant. In most countries, publicly traded corporations have a greater probability of being audited and are subject to public disclosure requirements and independent financial auditing, which tends to expose any under-reporting behaviour to the authorities. As audit levels and detection probability are greater, compliance should be higher.

3.2.2 Sector

Many studies have found that tax non-compliance varies widely across industry sectors. In the WBES, firms are categorized into one of five possible broadly defined industry sectors.19 They are: services (33%), agriculture (39%), construction (6%), manufacturing (9%), and other (3%). Dummy variables for industry sector need to be created if this characteristic is to be included in the analysis.
These industry categories in the WBES are likely too broadly defined to obtain any significant relationship between industry sector and tax non-compliance. For example, the service sector will include restaurants and hair salons as well as law and accounting firms. As a result, no significant relationship may be found between industry sectors and under-reporting.

19 Exactly 9% (888) of the firms in the sample did not respond to this question.

3.2.3 Size

Rice (1992), Giles (2000), and Hanlon et al. (2005) each investigated the relationship between a firm’s size and tax non-compliance. Rice (1992) defined a firm’s size according to the dollar value of assets it held and found a positive relationship between firm size and tax non-compliance, whereas Giles (2000) used a firm’s sales revenues and before-tax profits, and found a negative relationship between firm size and tax non-compliance. Using the same definition as Rice, Hanlon et al. (2005) found a convex relationship between firm size and non-compliance. None of these measures of firm size are available in the WBES. Instead, this paper investigates a firm’s size as defined in terms of the number of employees.20 A small firm is one with fewer than 50 employees (40%), a medium-sized firm has between 50 and 500 employees (40%), and a large firm has over 500 employees (20%). The relationship between firm size and tax compliance remains unclear but it is not unreasonable to assume a negative relationship between firm size, as defined in the WBES, and tax non-compliance.

3.2.4 Competition

While not explicitly investigated in the previous literature, it is possible that there is a positive relationship between the number of competitors in a given market and tax non-compliance. For example, firms in highly competitive markets may resort to tax non-compliance in order to reduce costs, set a lower price for their goods, and/or increase profits.
The WBES asks firms how many competitors they face in their market.21 Possible answers are: none (11%), between one and three (37%), and more than three (48%).

20 Only ¼ of 1% (25) of firms failed to respond to this question.
21 Fewer than 4% of firms in the sample did not respond to this question.

3.2.5 Age

The WBES includes information on the firm’s age.22 Specifically, it records whether the firm is less than five years old (26%), between five and fifteen years old (35%), or greater than fifteen years old (39%). The effect of a firm’s age on tax compliance is ambiguous. For example, younger firms may be less compliant because they may have more competitors, may be struggling to turn a profit, and/or may view tax evasion as a way to cut costs. On the other hand, young firms may be more reliant on external financing, which could cause them to be more compliant.

22 All firms responded to this question.

3.2.6 Other Characteristics

Other firm characteristics include indicators of whether the firm is foreign owned (18%), government owned (12%), an exporter (34%), and whether the firm subjects its financial statements to audits (58%).23 All of these indicators are expected to be positively related to compliance as found in the previous literature.

23 Non-response among these questions is very low (fewer than 5%).

3.2.7 Perceived Obstacles

In addition to recording information regarding the firm’s characteristics, the WBES also asks a number of questions about the firm’s perceptions of various constraints in the business environment, which likely influence operational decisions including tax compliance. While many of these variables have a low response rate, several have a relatively higher response rate.24 In particular, firms are asked if the following items present an obstacle or are a constraint to conducting business: access to capital, inflation, exchange rate, organized crime, political instability, taxes, regulations, and corruption. If these issues are perceived by the firm as affecting their ability to conduct business, then they may result in the firm engaging in tax non-compliance in order to reduce costs and be more competitive.

24 Taking account of these explanatory variables reduces the number of available observations to 5,393.

Andreoni (1992) argues that “…individuals facing binding borrowing constraints may use tax evasion to transfer resources from the future to the present. Even if a person finds tax evasion undesirable in the absence of borrowing constraints, it could become desirable if a borrowing constraint is binding. Tax evasion, therefore, may be a high-risk substitute for a loan.” (Andreoni 1992, 35-36). Johnson et al. (2000) note that under-reporting is likely increased if the economy has a poorly developed banking sector. Hence, if access to capital is an obstacle, the firm may under-report.

As noted by Giles and Tedds (2002), price inflation may influence under-reporting in two ways. First, if tax rates are not indexed then the tax burden rises simply due to “bracket creep”. Second, inflation generates uncertainty and “…businesses may hedge against this uncertainty by engaging in (more) underground activity.” (p. 11). They also note that the exchange rate can influence under-reporting. Exchange rates can be an obstacle to doing business if the exchange rate is unpredictable and can increase the input costs faced by businesses.

Various governmental factors can also influence under-reporting. Political instability constrains a country’s ability to collect taxes and also reduces the efficiency of collection, thereby increasing non-compliance. It is generally acknowledged in the tax non-compliance literature that the presence of corporate taxes provides incentives for firms to hide output to reduce their tax burden (Schneider and Enste 2000).
The degree of regulation is often cited as a factor that influences people to engage in underground activity as regulations reduce a firm’s freedom of choice (Brehm 1966; Deregulation Commission 1991; Monopolkommission 1998; Schneider and Enste 2000). Johnson et al. (1998a) find a positive correlation between an index of regulations and the underground economy and argue that governments should focus on reducing the density of regulations.

In the WBES, firms are asked if: (1) it is common to pay some “irregular” additional payments to government officials; and if (2) laws and regulations that affect the firm are interpreted inconsistently by the government or courts. If corruption is common, then among other effects, it will increase the cost of business, reduce morality, and reduce a firm’s confidence in government; all of which are likely to have a negative relationship with tax compliance. Johnson et al. (2000) and Johnson et al. (1998b) find a positive and significant relationship between extortion by government officials and under-reporting. The relationship between tax compliance and inconsistency in the application of laws and regulations is more ambiguous. If a firm can individually garner the favour of the government(s) and/or courts in the interpretation and application of laws and regulations, then this may reduce tax non-compliance. On the other hand, if the firm does not benefit from this inconsistency, then it may resort to tax non-compliance.

3.3 Data Summary Statistics

Since no obvious explanation is available for the missing observations on the sales reporting question, these observations (1,716) were dropped from the data set, leaving 8,153 observations. Table 3 provides summary statistics for the key variables described above.

4 Empirical Framework

The survey question which forms the basis for the dependent variable and is described above refers to categories. As a result, a firm’s reporting behaviour is not directly observed.
Rather, firms are categorized on the basis of the percentage of sales that go unreported. When a quantitative outcome is grouped into known intervals on a continuous scale, the data are said to be “interval-coded”. An ordered probit (or logit) is ideal when the dependent variable is discrete, ordinal in nature, and when the categories or thresholds are unknown.25 In this case, the thresholds are estimated along with the model’s coefficients and the variance of the error term is normalized to be one. It is possible, however, to modify the ordered probit model so that the thresholds are fixed at their known values and only the coefficients and the error variance are estimated. This estimation procedure is known as “interval regression” and is undertaken using maximum likelihood techniques. The key advantage of interval regression over the ordered probit is that it provides an asymptotically more efficient estimator as it uses the known threshold information and involves estimating fewer parameters. It is also preferred to OLS, as OLS on the grouped dependent variable model is inconsistent.26

25 That is, for an ordered logit/probit model, while it can be said that two is greater than one, it cannot be determined if the difference between two and one is somehow twice as important as the difference between one and zero. The latter is not true of interval coded data.
26 Another popular alternative is to define an artificial dependent variable that takes the value of the midpoint of the reported interval and then estimate the model by OLS. The resulting estimates are inconsistent but can be transformed using a minimum χ² method and the transformed estimates are consistent. However, the transformation requires distributional assumptions, usually normality and homoskedasticity, that are as strong as those required by the interval regression model.

The following is the general setup of the model and is based on the discussion contained in Stewart (1983). The responses for the dependent variable are coded 1, 2, 3, 4, 5, 6, and 7 to capture seven distinct sales under-reporting categories. Let $y_i$ denote the observable ordinal variable coded in this way for the ith firm and let $y_i^*$ denote the underlying variable that captures the sales under-reporting of the ith firm. This can be expressed as a linear function of a vector of explanatory variables $x_i$ using the following relationship:

$$
y_i^* = x_i'\beta + u_i, \qquad u_i \sim N(0, \sigma^2). \tag{1}
$$

It is assumed that $y_i^*$ is related to the observable ordinal variable $y_i$ as follows:

$$
y_i =
\begin{cases}
1 & \text{if } y_i^* \le 49\% \\
2 & \text{if } 49\% < y_i^* \le 59\% \\
3 & \text{if } 59\% < y_i^* \le 69\% \\
4 & \text{if } 69\% < y_i^* \le 79\% \\
5 & \text{if } 79\% < y_i^* \le 89\% \\
6 & \text{if } 89\% < y_i^* \le 99\% \\
7 & \text{if } y_i^* > 99\%
\end{cases}
\tag{2}
$$

Based on the above, the seven possible components of the general log likelihood function for the ith individual are expressed as:

$$
\begin{aligned}
L_i ={}& I[y_i = 1] \times \log_e\left\{\Phi\left(\frac{49 - x_i'\beta}{\sigma}\right) - \Phi\left(\frac{a_0 - x_i'\beta}{\sigma}\right)\right\} \\
&+ I[y_i = 2] \times \log_e\left\{\Phi\left(\frac{59 - x_i'\beta}{\sigma}\right) - \Phi\left(\frac{49 - x_i'\beta}{\sigma}\right)\right\} \\
&+ I[y_i = 3] \times \log_e\left\{\Phi\left(\frac{69 - x_i'\beta}{\sigma}\right) - \Phi\left(\frac{59 - x_i'\beta}{\sigma}\right)\right\} \\
&+ I[y_i = 4] \times \log_e\left\{\Phi\left(\frac{79 - x_i'\beta}{\sigma}\right) - \Phi\left(\frac{69 - x_i'\beta}{\sigma}\right)\right\} \\
&+ I[y_i = 5] \times \log_e\left\{\Phi\left(\frac{89 - x_i'\beta}{\sigma}\right) - \Phi\left(\frac{79 - x_i'\beta}{\sigma}\right)\right\} \\
&+ I[y_i = 6] \times \log_e\left\{\Phi\left(\frac{99 - x_i'\beta}{\sigma}\right) - \Phi\left(\frac{89 - x_i'\beta}{\sigma}\right)\right\} \\
&+ I[y_i = 7] \times \log_e\left\{\Phi\left(\frac{a_K - x_i'\beta}{\sigma}\right) - \Phi\left(\frac{99 - x_i'\beta}{\sigma}\right)\right\}
\end{aligned}
\tag{3}
$$

where Φ denotes the cumulative distribution function of the standard normal, Φ(−∞) = 0, Φ(∞) = 1, $\log_e(\cdot)$ denotes the natural logarithmic operator, and I[·] is an indicator function that takes the value of one when the statement in the square brackets is true and zero when it is false. The relevant part of the log-likelihood is then triggered by the indicator function for whether the individual falls within one of the seven intervals in question. The maximum likelihood procedure now involves the estimation of the β parameter vector and the ancillary standard error parameter σ.
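As an illustration of equation (3), the following minimal Python sketch evaluates the log-likelihood contribution of a single firm. It is not the estimation code used in the paper (which relies on STATA); the function names `norm_cdf`, `interval_bounds` and `loglik_i` are hypothetical, the thresholds are those of equation (2), and the defaults for `a0` and `aK` assume open-ended end intervals.

```python
import math

def norm_cdf(z):
    """Standard normal CDF; math.erf returns -1 and 1 at -inf and +inf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def interval_bounds(y, a0=-math.inf, aK=math.inf):
    """Lower and upper threshold for response category y = 1..7 (equation 2).
    a0 and aK are the end-interval values; open-ended defaults are an assumption."""
    cuts = [a0, 49.0, 59.0, 69.0, 79.0, 89.0, 99.0, aK]
    return cuts[y - 1], cuts[y]

def loglik_i(y, xb, sigma, a0=-math.inf, aK=math.inf):
    """Log-likelihood contribution of one firm with index xb = x_i'beta."""
    lo, hi = interval_bounds(y, a0, aK)
    prob = norm_cdf((hi - xb) / sigma) - norm_cdf((lo - xb) / sigma)
    return math.log(prob)

def loglik(ys, xbs, sigma, a0=-math.inf, aK=math.inf):
    """Sample log-likelihood: the sum of the individual contributions."""
    return sum(loglik_i(y, xb, sigma, a0, aK) for y, xb in zip(ys, xbs))
```

With open-ended end intervals the seven category probabilities sum to one for any index value, which is a convenient sanity check. Maximizing `loglik` over β and σ would require a numerical optimizer, which is outside this sketch.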
The question then becomes how the end intervals should be treated. That is, what are the values for $a_0$ and $a_K$ noted in equation (3)? Stewart (1983) treats the first and last intervals as open-ended. In this paper, it is difficult to argue that a firm would report less than 0% of their sales. Instead, it could be argued that the first interval should begin at either 0% or 1%, the latter because firms that report 0% of their sales would likely be operating completely as “ghosts” and would not have been selected for an interview since there would be no formal record of the firm. It could also be argued that the last interval should be treated as closed at 100%. However, treating the last interval as open-ended accounts for possible sales over-reporting. Rice (1992) found that about 6% of all corporations overstate their taxable income to some extent. Firms may over-report due to a misinterpretation of tax laws, to avoid a tax audit, to secure financing, or, particularly for public corporations, to appear more competitive. In the WBES, firms that over-report will likely be included in the full compliance category. The effect of choosing different values for $a_0$ and $a_K$ on the resulting estimates will be explored. It should be noted that the sensitivity of the results and predicted values to the treatment of the end intervals is often overlooked in the literature yet can be of clear importance to the economic validity of the resulting estimates.

Unlike the situation with the ordered probit estimation, the estimated coefficients from an interval regression are interpretable as if $y_i^*$ were observed for each i and $E(y^* \mid x) = x\beta$ were estimated by OLS. That is, the estimated coefficients can be interpreted as the marginal effects (i.e. the change in percentage of sales reported given a change in the independent variable, holding all else constant).
It should be noted that the estimates contained in the β parameter vector are only interpretable in this way due to the assumption that y* given x satisfies the classical linear model assumptions. If these assumptions do not hold then the interval regression estimator of β would be inconsistent. As a result, it is important to test the key assumptions of functional form, homoskedasticity, and normality, all of which are used to derive the log likelihood function noted in equation (3). Unfortunately, these assumptions, despite their importance, are seldom tested in applied research. Equally, these assumptions, particularly homoskedasticity and normality, are rarely met when using micro data.

4.2 Diagnostic Tests

Chesher and Irish (1987) outline diagnostic tests for (pseudo) functional form, normality and homoskedasticity for the ordered probit model that are easily modified for the interval regression model. These tests are all score (or Lagrange Multiplier) tests for which the test statistics take the form

$$
\xi = 1'F(F'F)^{-1}F'1 \tag{4}
$$

where 1 is an n-dimensional vector of ones, and F is a matrix with row order n where each row contains the score contributions for all the parameters of the model. ξ can be easily calculated as n times the uncentered R² from a regression of 1 on the columns of F.

The construction of the F matrices for these tests, which are described below, is based on computations of the pseudo-residuals. Usually, residuals are defined as the difference between the observed and estimated values of the dependent variable. However, the estimated values of the dependent variable obtained in the interval regression have no counterpart in the data. Rather, pseudo-errors need to be computed.
An expression for the pseudo-errors is obtained by differentiating equation (3) with respect to the index $x_i'\beta$ and is denoted for the ith individual as:

$$
u_i = \frac{\phi\left(\dfrac{a_{(j-1)i} - x_i'\beta}{\sigma}\right) - \phi\left(\dfrac{a_{ji} - x_i'\beta}{\sigma}\right)}{\sigma\left[\Phi\left(\dfrac{a_{ji} - x_i'\beta}{\sigma}\right) - \Phi\left(\dfrac{a_{(j-1)i} - x_i'\beta}{\sigma}\right)\right]}
\tag{5}
$$

where φ(·) denotes the probability density function for the standard normal, and $a_{j-1}$ and $a_j$ denote the known interval parameters for individual i (e.g. if $y_i = 2$ then $a_{j-1} = 49$ and $a_j = 59$). The pseudo-residuals, $e_i$, are obtained by replacing the unknown parameters in (5) with their maximum likelihood estimates. For the homoskedasticity and non-normality tests, higher-order moment residuals are required, specified as:

$$
M_{\tau i} = \frac{\left(\dfrac{a_{(j-1)i} - x_i'\beta}{\sigma}\right)^{\tau}\phi\left(\dfrac{a_{(j-1)i} - x_i'\beta}{\sigma}\right) - \left(\dfrac{a_{ji} - x_i'\beta}{\sigma}\right)^{\tau}\phi\left(\dfrac{a_{ji} - x_i'\beta}{\sigma}\right)}{\sigma\left[\Phi\left(\dfrac{a_{ji} - x_i'\beta}{\sigma}\right) - \Phi\left(\dfrac{a_{(j-1)i} - x_i'\beta}{\sigma}\right)\right]}
\tag{6}
$$

The higher-order moment residuals are obtained by replacing the unknown parameters in (6) with their maximum likelihood estimates. The first four moment residuals are required for the desired tests and are defined as follows:

$$
e_i^1 = e_i, \qquad e_i^2 = \hat{M}_{1i}, \qquad e_i^3 = 2e_i^1 + \hat{M}_{2i}, \qquad e_i^4 = 3e_i^2 + \hat{M}_{3i}
\tag{7}
$$

The F matrix, or score contributions, is obtained by multiplying the pseudo-residuals by the various auxiliary variables in question.

It should be noted that if either or both of the assumptions of normality and homoskedasticity are rejected, then the Huber (1967) “sandwich” estimator of the variance can be used in place of the conventional Maximum Likelihood variance estimator. This estimator is expressed as:

$$
\mathrm{Var}(\hat{\beta}) = [I(\hat{\beta})]^{-1}\left(\sum_i x_i' e_i^2 x_i\right)[I(\hat{\beta})]^{-1}
$$

where $I(\hat{\beta})$ is the information matrix for the $\hat{\beta}$ vector, computed at the maximum likelihood estimates. However, unlike with linear regression, in an interval regression model it is not possible to correctly specify $E(y^* \mid x)$ but misspecify $\mathrm{Var}(y^* \mid x)$. The validity of this correction is therefore highly questionable in the interval regression model and, hence, it is not used in this paper.
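The pseudo-errors in equation (5) and the moment residuals in equations (6) and (7) can be sketched in Python as follows. This is an illustrative sketch only, not the author's code: the function names are hypothetical, the parameters `xb` and `sigma` stand for the fitted values of $x_i'\beta$ and σ, and terms evaluated at an infinite threshold are set to their limiting value of zero.

```python
import math

def norm_pdf(z):
    """Standard normal density, with the limit 0 at +/-inf."""
    if math.isinf(z):
        return 0.0
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def weighted_pdf(z, tau):
    """z**tau * phi(z); the product tends to 0 as z goes to +/-inf."""
    if math.isinf(z):
        return 0.0
    return (z ** tau) * norm_pdf(z)

def moment(lo, hi, xb, sigma, tau):
    """M_tau from equation (6); tau = 0 reproduces equation (5)."""
    z_lo = (lo - xb) / sigma
    z_hi = (hi - xb) / sigma
    numerator = weighted_pdf(z_lo, tau) - weighted_pdf(z_hi, tau)
    denominator = sigma * (norm_cdf(z_hi) - norm_cdf(z_lo))
    return numerator / denominator

def moment_residuals(lo, hi, xb, sigma):
    """First four moment residuals (e1, e2, e3, e4) from equation (7)."""
    e1 = moment(lo, hi, xb, sigma, 0)
    e2 = moment(lo, hi, xb, sigma, 1)
    e3 = 2.0 * e1 + moment(lo, hi, xb, sigma, 2)
    e4 = 3.0 * e2 + moment(lo, hi, xb, sigma, 3)
    return e1, e2, e3, e4
```

For an interval that is symmetric around the fitted index, the first and third residuals are zero, as one would expect for a symmetric error distribution.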
While it is worthwhile considering the results of the aforementioned tests, care should be exercised when interpreting the associated results. Orme (1990) has questioned the use of such score tests in the context of a simple binary probit and demonstrated their poor finite sample properties in this setting. In particular, he notes that there is upward size-distortion. Assuming that these findings extend to the interval regression model, the diagnostic tests may indicate that the model does not satisfy the classical linear model assumptions when in fact it does. However, the sample size for the data used in this paper is large enough that this concern likely does not apply.

4.2.1 Pseudo-Functional Form Test

The (pseudo) functional form test is a modified version of the RESET test (Ramsey 1969). F is given as

$$
F = (e^1 x,\; e^1 \hat{y}^{*2},\; \ldots,\; e^1 \hat{y}^{*K},\; e^2) \tag{8}
$$

where x includes a column of 1’s if the grouped model contains an intercept and $\hat{y}^{*K}$ is the Kth power of the model’s predicted standardized index, $\hat{y}^* = x'\hat{\beta}/\sigma$. The test statistic ξ is distributed as χ²(K − 1). If the null hypothesis is rejected, this is evidence of model misspecification that may be rectified by considering additional variables and/or alternative functional forms (such as a semilog model).

4.2.2 Test for Homoskedasticity

For the test of homoskedasticity, F is given as27

$$
F = (e^1 x,\; e^2 d_i) \tag{9}
$$

The test statistic ξ is distributed as χ²(q) where q is equal to the number of variables that are interacted with $e^2$. If the form of the heteroskedasticity is unknown, then $d_i = xx'$. This is simply White’s test for heteroskedasticity of unknown form, modified for the interval regression model. The test is sensitive to model misspecification, so if the test for (pseudo) functional form is rejected, it is likely that the test for homoskedasticity will also be rejected.
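Once the F matrix for a given test has been assembled, the statistic in equation (4) can be computed directly. The following is a minimal Python sketch under stated assumptions: F is supplied as a list of rows, the helper names `solve` and `score_test` are hypothetical, and a small Gaussian elimination routine stands in for a linear algebra library. Comparing the result with the relevant χ² critical value is left to the reader.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("F'F is singular; drop redundant columns of F")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        rest = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - rest) / M[r][r]
    return x

def score_test(F):
    """xi = 1'F (F'F)^{-1} F'1, i.e. n times the uncentered R-squared
    from regressing a vector of ones on the columns of F."""
    k = len(F[0])
    FtF = [[sum(row[a] * row[b] for row in F) for b in range(k)]
           for a in range(k)]
    Ft1 = [sum(row[a] for row in F) for a in range(k)]  # column sums of F
    coef = solve(FtF, Ft1)
    return sum(Ft1[a] * coef[a] for a in range(k))
```

When every column of F sums to zero the statistic is zero, and when a column of ones is included the regression fits perfectly and the statistic equals n.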
As indicated previously, the estimated coefficients are inconsistent in the presence of uncorrected heteroskedasticity but it is possible to address this inconsistency by modeling the heteroskedasticity. The preferred choice for the functional form for the heteroskedasticity is a variation of Harvey’s (1976) multiplicative heteroskedasticity, denoted as $\mathrm{Var}(u_i) = \exp(d_i\gamma)^2$, where $d_i$ is a matrix of independent variables, including a column of ones, that are the source of the heteroskedasticity and γ is a coefficient vector. The expression $\exp(d_i\gamma)$ replaces the σ noted in equation (3).28

27 When an intercept is estimated so that x always contains a unit element, $e^2$ is redundant in the test for homoskedasticity.

4.2.3 Non-Normality Test

Finally, F in the usual χ²(2) test for zero skewness and/or excess kurtosis is given by

$$
F = (e^1 x,\; e^2,\; e^3,\; e^4) \tag{10}
$$

It is anticipated that the assumption of normality is likely violated in this model. Table 1b clearly shows that the distribution of sales reporting is negatively skewed. Normality may be more closely approximated by taking the natural logarithm of 100 minus the dependent variable, which is simply the natural logarithm of the percentage of sales not reported to the tax authority. Such a transformation requires that the dependent variable take only strictly positive values, and care must be exercised in interpreting the resulting estimates. This is a transformation that can also be explored if the model fails the functional form test. It should also be acknowledged that the presence of uncorrected heteroskedasticity can also lead to a rejection of the null of normality. Correcting for the heteroskedasticity, as noted above, can often achieve normality in the (pseudo) residuals.

5 Results

Estimation was undertaken using STATA 9.2 and the interval regression model was estimated using the ‘intreg’ command. The command needs two variables, denoted y1 and y2, to define the dependent variable.
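To make the role of the two endpoint variables concrete, the following Python sketch maps each response category to a (y1, y2) pair. It is an illustration under assumptions rather than a reproduction of the paper's coding: the interior thresholds are taken from equation (2), the function name `intreg_endpoints` is hypothetical, and `None` stands for an open-ended bound, in the same way that STATA's ‘intreg’ treats a missing endpoint as an open-ended interval.

```python
def intreg_endpoints(category, a0=None, aK=None):
    """Return (y1, y2), the lower and upper endpoints for category 1..7.

    a0 and aK are the end-interval values from equation (3); None means
    the interval is open-ended on that side (an assumption of this sketch).
    """
    if category not in range(1, 8):
        raise ValueError("category must be an integer from 1 to 7")
    lower = [a0, 49, 59, 69, 79, 89, 99]
    upper = [49, 59, 69, 79, 89, 99, aK]
    return lower[category - 1], upper[category - 1]

# Example: endpoints for all seven categories with the first interval
# starting at 0 and the last interval closed at 100.
concordance = {c: intreg_endpoints(c, a0=0, aK=100) for c in range(1, 8)}
```

Changing `a0` and `aK` only alters the first and last rows of the concordance, which is what makes it straightforward to re-estimate the model under alternative treatments of the end intervals.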
In particular, y1 and y2 are used to hold the endpoints of the interval. Table 4 provides the concordances between yi, the associated interval, y1, and y2. The model for the analysis is:

\[
\begin{aligned}
y_i^{*} = {} & \alpha + \beta_1 \mathit{PROP}_i + \beta_2 \mathit{PARTNER}_i + \beta_3 \mathit{OTHERBUS}_i + \beta_4 \mathit{PRIVCORP}_i + \beta_5 \mathit{SERVICE}_i + \beta_6 \mathit{AGR}_i \\
& + \beta_7 \mathit{CONSTRUC}_i + \beta_8 \mathit{OTHERIND}_i + \beta_9 \mathit{SMALL}_i + \beta_{10} \mathit{LARGE}_i + \beta_{11} \mathit{LESS5}_i + \beta_{12} \mathit{OVER15}_i \\
& + \beta_{13} \mathit{NOCOMPET}_i + \beta_{14} \mathit{MORE3COMPET}_i + \beta_{15} \mathit{FOREIGN}_i + \beta_{16} \mathit{GOVOWN}_i + \beta_{17} \mathit{EXPORT}_i \\
& + \beta_{18} \mathit{AUDIT}_i + \beta_{19} \mathit{FIN}_i + \beta_{20} \mathit{INSTABILE}_i + \beta_{21} \mathit{ORGCRIME}_i + \beta_{22} \mathit{TAX}_i + \beta_{23} \mathit{CORRUP}_i \\
& + \beta_{24} \mathit{INFLAT}_i + \beta_{25} \mathit{EXCHANGE}_i + \beta_{26} \mathit{LAWS}_i + \sum_{j=27}^{106} \beta_j \mathit{COUNTRY}_{ji} + u_i, \qquad i = 1, \ldots, N \qquad (11)
\end{aligned}
\]

Table 3 provides a description of the above noted variables and Table 2 lists the countries available in the data.

28 In STATA, the heteroskedastic corrected interval regression model is estimated using the 'het' option on the 'intreg' command.

The model is estimated both with and without the perception variables, which produces two specifications to consider. As indicated above, it is also necessary to specify the values of the two end intervals. The values of -∞, 0, and 1 are considered for a0, and 100 and ∞ are considered for aK. Combining these six endpoint treatments with the two specifications gives twelve models in total, the results of which are presented in Table 5. Models 1 through 6 do not include the perception variables while Models 7 through 12 do. The treatment of the end points is noted in the heading of each column. For example, Models 1 and 7 treat the first and last intervals as open ended.

In all models the constant represents the reporting behaviour of the baseline firm. The baseline firm is a public corporation in the manufacturing industry, has between 50 and 500 employees, is between five and fifteen years of age, is a domestic, non-government owned non-exporter, has between one and three competitors, and does not have its financial statements audited.
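The endpoint coding described above and in Table 4 can be sketched as follows. This is a minimal illustration, not the code used for the paper. The function names are hypothetical, and `None` stands in for the missing value that marks an open-ended interval. The second function shows the standard interval regression log-likelihood contribution under normality, which is assumed here to correspond to the likelihood set out earlier in the paper.

```python
import math

# Upper bounds of categories 1..6 from Table 4; category 7 is (99, aK].
UPPER = [49, 59, 69, 79, 89, 99]


def interval_endpoints(category, a0=-math.inf, aK=math.inf):
    """Map yi in 1..7 to (y1, y2); infinite endpoints become None (missing)."""
    if not 1 <= category <= 7:
        raise ValueError("category must be between 1 and 7")
    lower = a0 if category == 1 else UPPER[category - 2]
    upper = aK if category == 7 else UPPER[category - 1]
    lower = None if math.isinf(lower) else lower
    upper = None if math.isinf(upper) else upper
    return lower, upper


def norm_cdf(z):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def loglik_contribution(y1, y2, xb, sigma):
    """ln[Phi((y2 - xb)/sigma) - Phi((y1 - xb)/sigma)] for one observation."""
    upper = 1.0 if y2 is None else norm_cdf((y2 - xb) / sigma)
    lower = 0.0 if y1 is None else norm_cdf((y1 - xb) / sigma)
    return math.log(upper - lower)


# Model 1 style coding (both ends open) versus Model 5 style coding (1 to 100).
open_first = interval_endpoints(1)
closed_last = interval_endpoints(7, a0=1, aK=100)
```

Replacing `sigma` with exp(d_i γ) for each observation gives the heteroskedastic variant described in Section 4.2.2.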
According to Models 1 through 3, the baseline firm reports just over 90% of its sales to the tax authority, and this falls to just over 80% for Models 4 through 6. For Models 7 through 9, this jumps to an implausibly high estimate of approximately 110% but falls to a more realistic figure of approximately 91% for Models 10 through 12.

Models 1 through 3 and 7 through 9 all treat the last interval as open ended, and it is important to investigate the implications that this has for the predicted values, ŷ*. The mean, standard deviation, minimum, and maximum of the predicted values, and the number of predicted values in excess of 100%, are also reported in Table 5. In Models 1 through 3 and 7 through 9, the maximum predicted value is in excess of 140% and over 25% of the predicted values exceed 100%. These numbers seem economically implausible, particularly the implication that over 25% of firms report in excess of 100% of their sales to the tax authority. Hence, Models 1 through 3 and 7 through 9 are ruled out. This leaves Models 4-6 and 10-12 for consideration. In these models, the maximum predicted value is approximately 105% and approximately 1% of the predicted values exceed 100%. These figures seem economically plausible and accord with the findings of Rice (1992), who found that some firms do over-report to some extent. A comparison of the estimates across Models 4-6 and 10-12 shows little variation.

Various goodness of fit measures are also presented in Table 5 for the relevant models. Larger (less negative) log-likelihood values are indicative of a better fit. The log-likelihood values can be compared across models only if the models have the same sample and dependent variable. The R-square, for technical reasons, cannot be computed in the same way in interval regression as it is in OLS regression. Various pseudo R-square measures have been proposed, but there is no generally accepted measure.
Veall and Zimmermann (1996) recommend the measure of McKelvey and Zavoina (1975), which is reported in Table 5.29 Again, a comparison of the goodness of fit estimates across the models shows little variation. Given the similarities across Models 4-6 and 10-12, it is difficult to eliminate further models based solely on the goodness of fit measures. Since Models 5 and 11 constrain the unobserved dependent variable to range between 1 and 100, which seems economically reasonable, Models 5 and 11 are selected for further consideration.

29 The McKelvey and Zavoina (1975) R-square is computed in STATA with the 'fitstat' command.

The diagnostic test results for the preferred models, Models 5 and 11, are presented in Table 5. For Model 5, the null of correct functional form is not rejected at the 1% significance level but is rejected at the 5% and 10% levels. The null hypotheses of all other tests are rejected at the usual significance levels. This is concerning since it implies that the estimates are inconsistent. As mentioned above, the dependent variable is negatively skewed. Since the preferred models each constrain the dependent variable to be strictly positive, a possible transformation to consider is to transform the dependent variable into the percentage of sales not reported to the tax authority, calculated as 100% minus the threshold parameters. This newly formed variable is positively skewed and can be smoothed by taking the natural logarithm.30 The results from these models are presented in Table 6. The coefficients cannot be directly compared to those reported in Table 5 due to the transformation of the dependent variable.
That is, the coefficients are no longer marginal effects; instead, 100 multiplied by the estimated coefficient represents the percentage change.31 Finally, as the dependent variable is now the natural logarithm of the percentage of sales not reported to the tax authority, the signs on the estimated coefficients should be the opposite of those in Table 5, where the dependent variable was the percentage of sales reported to the tax authority.

30 Since taking 100 minus the observed threshold parameters results in a value of 0 for those firms that report 100% of their sales, a small constant is added to the transformation such that the newly formed variable is strictly positive everywhere.

31 Kennedy (1981), however, notes that in a semi-log model the coefficient on a dummy variable must be appropriately transformed. That is, if $b_i$ is the estimated coefficient on a dummy variable and $V(b_i)$ is the estimated variance of $b_i$, then

\[ g_i = 100\left(\exp\!\left(b_i - \frac{V(b_i)}{2}\right) - 1\right) \qquad (12) \]

gives an estimate of the percentage impact of the dummy variable on the dependent variable. The effect of this transformation on the resulting coefficients was explored. As the resulting changes to the coefficients reported in Table 6 were marginal, the results are not reported but are available from the author upon request.

The columns Model 5a and Model 11a in Table 6 present the results for the preferred models from Table 5 using the transformed dependent variable. For both models, the null of correct functional form is not rejected at any of the usual significance levels, which represents a significant improvement over the models presented in Table 5, but the null hypotheses of all other tests are rejected at the usual significance levels. Figure 2 presents plots of the pseudo residuals from these two models, which support the finding of non-normality, and it is not clear whether any further transformations would result in normality. In addition, further transformations would make it difficult to interpret the estimated coefficients.
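The two ways of reading a dummy-variable coefficient in the semi-log model, the direct reading used in the text and the Kennedy (1981) transformation in equation (12), can be compared with a short sketch. The function names are hypothetical. The coefficient and standard error are those reported for sole proprietorships under Model 5b in Table 6, and the variance is assumed to be the square of the reported standard error.

```python
import math


def direct_percent_effect(b):
    """Reading used in the text: 100 times the estimated coefficient."""
    return 100.0 * b


def kennedy_percent_effect(b, se):
    """Equation (12): 100 * (exp(b - V(b)/2) - 1), with V(b) taken as se squared."""
    return 100.0 * (math.exp(b - 0.5 * se ** 2) - 1.0)


# Sole proprietorships, Model 5b, Table 6: coefficient 0.225, s.e. 0.066.
b, se = 0.225, 0.066
direct = direct_percent_effect(b)
kennedy = kennedy_percent_effect(b, se)
```

For coefficients close to zero the two measures nearly coincide, and they diverge as the absolute value of the coefficient grows.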
It is, however, possible to correct for the heteroskedasticity, as noted above, if a form for it is specified. It is not unreasonable to assume that much of the heteroskedasticity results from countries and business type. Applications of the test for homoskedasticity of known form across subsets of the variables confirm this suspicion. Models 5a and 11a are re-estimated to account for the known form of the heteroskedasticity and the results are also presented in Table 6 under the column headings Model 5b and Model 11b. These results will now be discussed.

The results for Model 5b will be discussed first. Sole proprietorships and private corporations are the least compliant forms of business of all the business types included in this study. This result accords with those of Rice (1992), who found that public corporations were more compliant than other types of firms, and Hanlon et al. (2005), who found that private corporations were less compliant. Small firms and firms with more than three competitors are also found to be less compliant, while large firms and firms with no competitors are more compliant. Giles (2000) reports a similar result with respect to size, whereas Rice (1992) found that firm size and tax non-compliance were positively related and Hanlon et al. (2005) found that compliance was a convex function of firm size. The results with respect to competition can be interpreted in several ways. First, firms in highly competitive markets may resort to tax non-compliance to reduce costs, allowing the firm to set a lower price for its goods and/or increase its profits. Second, firms base their behaviour on the behaviour of firms that are in the same area of activity as themselves. This lends credence to using the firm's response regarding the reporting behaviour of a typical firm in its area of activity as a proxy for the firm's own behaviour.
Finally, being an established firm (greater than fifteen years of age) slightly increases compliance, while being foreign or government owned and having financial statements audited all lead to significantly higher compliance. Both Giles (2000) and Chan and Mo (2000) found that foreign owned firms are more compliant, and the results in Table 6 provide further support for this result. The relationship between the internal audit controls of the firm and tax compliance has not been investigated in previous empirical work.

Numerically, the percentage of sales not reported to the tax authority is: 22.5 percent higher for sole proprietorships and 9.4 percent higher for private corporations; 17.9 percent higher for small firms and 13.1 percent lower for large firms; 7.9 percent lower for firms greater than fifteen years of age; 17.5 percent lower for firms with no competitors but 11.6 percent higher for firms with more than three competitors; and 17.3 percent, 11.7 percent and 19.2 percent lower for foreign owned firms, government owned firms, and firms with audited financial statements, respectively. Chan and Mo (2000) report that export oriented firms are more compliant, but this characteristic is found to be insignificant here. Previous empirical studies all found a significant relationship between industry and tax non-compliance, whereas no significant effect is found in this specification. As discussed above, the categories included in the WBES data set are likely too broadly defined to yield any significant results.

The results for Model 11b, which includes the perception variables, are similar to those reported for Model 5b, with most being only slightly larger.
The key exceptions are: the coefficient on sole proprietorships, which is over 1.5 times higher at 34 percent; the coefficient on the service industry, which is now statistically significant and suggests that the percentage of sales not reported by firms in the service industry is 8.6% lower; the coefficient for firms greater than fifteen years of age, which is no longer statistically significant, while the coefficient for firms less than five years of age is now statistically significant and suggests that they are more compliant, with the percentage of sales not reported being 5.7% lower; the coefficient on more than three competitors, which is no longer statistically significant; and the coefficient for foreign owned firms, which is also no longer statistically significant.

With respect to the perception variables, firms that perceive high taxes and regulations, and government corruption, as obstacles to doing business report significantly less of their sales to the tax authority. The results presented in Table 6 indicate that government corruption has the single largest effect, with the percentage of sales not reported to the tax authority being 53.4 percent higher, while for high taxes and regulations it is 20.2 percent higher. In comparison, Johnson et al. (2000) also found a positive relationship between non-compliance and government corruption but failed to find a relationship between compliance and tax payments. Exchange rates have a positive effect on compliance, and this is the only coefficient that fails to follow a priori expectations. All other variables are statistically insignificant.

7 Conclusion

Very little is actually known about firm tax compliance due to a lack of detailed and readily available data. The purpose of this paper was to use a unique and recently available dataset that contains information on firms from around the world to investigate some of the factors that affect business tax compliance. This is one of the first studies to examine firm tax compliance using worldwide data.
The majority of previous empirical studies were confined to examining firms within a particular country using tax audit data. The empirical strategy employed in this paper exploits the nature of the dependent variable, which is interval coded, and uses interval regression, which provides an asymptotically more efficient estimator than the ordered probit. In addition, the estimated coefficients from an interval regression are interpretable as marginal effects provided that the model satisfies the assumptions of correct functional form, homoskedasticity, and normality. If these assumptions do not hold then the interval regression estimator of β is inconsistent. These assumptions are investigated using standard diagnostic tests that have been modified for the interval regression model, a step frequently ignored in applied research. The test results support the use of the semi-log model corrected for heteroskedasticity, the origin of which is found to lie at the country level and in the business type.

Overall, evidence is presented that shows that firms in all regions around the world engage in tax non-compliance, but that there is substantial variation within regions. The detailed results indicate that the legal organization of a business does affect tax compliance, with sole proprietorships and private corporations being the least compliant business types. Firm size is also an important indicator of tax compliance, with small firms reporting less and large firms more of their sales to the tax authority. Firms with no competitors, government owned firms and firms with audited financial statements are significantly more compliant. Industry and exporter status are found to have no significant effect on compliance, and the effect of firm age on compliance varies with model specification.
Whether having more than three competitors and being foreign owned affect compliance depends on the specification of the model, as is also the case with firm age. Not surprisingly, taxes and government corruption are positively related to tax non-compliance and the magnitude is quite large in both cases. Access to capital, political instability, organized crime, inflation, and the legal system are found to have no statistically significant effect on tax non-compliance.

The findings do suggest a role for public policy, as well as actions to be considered by the tax authority and items that require further study. First, the findings suggest that administrations interested in reducing business tax non-compliance should consider reducing business taxes, minimizing the number of regulations, and reducing, if not eliminating, government corruption. Admittedly, taking action on these issues is complex and involves more than just the tax authority. Second, tax authorities should consider auditing sole proprietorships, private corporations, and small firms at a higher rate and requiring all firms to have their financial statements audited by a third party. Since firms with no competitors are found to be significantly more compliant and firms in highly competitive sectors are found to be less compliant, governments will have to weigh the economic efficiency gains from encouraging market competition against the clear loss in tax revenues that this causes. Finally, based on this study, it is not entirely clear why large size, foreign ownership, and exchange rates are associated with higher tax compliance. Further exploration of these relationships appears to be a worthwhile venture.

It is also worth considering the empirical strategy employed in this paper. The dependent variable in this model is a fractional variable, an index, that takes values between zero and one.
The estimation strategy employed in this paper assumes that the marginal effects are the same throughout the index. Papke and Wooldridge (2005) have extended a fractional response model that accounts for responses occurring at the zero and one corners and “…ensures that the fitted values are between zero and one and that the marginal effect are diminishing as the index argument increases.” (p. 2) This and other fractional response models have not yet been extended to interval-coded data but the consideration of such an extension is warranted.

TABLES

Table 1a: Univariate Frequencies of Percentage of Sales Reported to Tax Authorities, including missing observations

              Missing   <50%    50-59%  60-69%  70-79%  80-89%  90-99%  100%
Frequency     1,879     936     694     501     703     916     1,096   3,307
Percent       18.73%    9.33%   6.92%   4.99%   7.01%   9.31%   10.93%  32.96%
Observations  10,032

Table 1b: Univariate Frequencies of Percentage of Sales Reported to Tax Authorities, excluding missing observations

              <50%    50-59%  60-69%  70-79%  80-89%  90-99%  100%
Frequency     936     694     501     703     916     1,096   3,307
Percent       11.48%  8.51%   6.14%   8.62%   11.24%  13.44%  40.56%
Observations  8,153

Table 2: Countries Surveyed, Categorized by Region, and Number of Observations in Each Country

Country                      Observations
Africa and Middle East
  Botswana                   87
  Cameroon                   45
  Côte d'Ivoire              66
  Egypt                      102
  Ethiopia                   77
  Ghana                      85
  Kenya                      89
  Madagascar                 84
  Malawi                     42
  Namibia                    66
  Nigeria                    73
  Senegal                    85
  South Africa               99
  Tanzania                   66
  Uganda                     106
  West Bank and Gaza         68
  Zambia                     64
  Zimbabwe                   102
  Total                      1,456
Asia
  Bangladesh                 39
  Cambodia                   239
  China                      85
  India                      160
  Indonesia                  72
  Malaysia                   50
  Pakistan                   80
  Philippines                90
  Singapore                  88
  Thailand                   369
  Total                      1,272
OECD
  Canada                     95
  France                     89
  Germany                    77
  Italy                      81
  Portugal                   84
  Spain                      89
  Sweden                     82
  United Kingdom             79
  United States1             94
  Total                      770
Transition Europe
  Bosnia and Herzegovina     104
  Bulgaria                   97
  Croatia                    112
  Czech Republic             106
  Estonia                    126
  Hungary                    121
  Lithuania                  29
  Poland                     208
  Romania                    125
  Slovak Republic            24
  Slovenia                   123
  Turkey                     124
  Total                      1,299
Former Soviet Union
  Armenia                    111
  Azerbaijan                 114
  Belarus                    116
  Georgia                    99
  Kazakhstan                 102
  Kyrgyzstan                 124
  Moldova                    116
  Russia                     481
  Ukraine                    188
  Uzbekistan                 118
  Total                      1,569
Latin American and Caribbean
  Argentina                  83
  Belize                     33
  Bolivia                    87
  Brazil                     172
  Chile                      94
  Colombia                   92
  Costa Rica                 80
  Dominican Republic         90
  Ecuador                    85
  El Salvador                95
  Guatemala                  86
  Haiti                      95
  Honduras                   85
  Mexico                     80
  Nicaragua                  86
  Panama                     80
  Peru                       102
  Trinidad and Tobago        93
  Uruguay                    87
  Venezuela                  82
  Total                      1,787

Notes: 1 Denotes the omitted category in estimation.

Table 3: Data Summary

Variable                    Obs.   Acronym       Mean    Standard Deviation  Minimum  Maximum
% of Sales Reported         8153   REPORT        2.978   2.166               1        7
Legal Organization of Company
  Sole Prop.                8153   PROP          0.180   0.384               0        1
  Partnerships              8153   PARTNER       0.173   0.378               0        1
  Private Corp              8153   PRIVCORP      0.280   0.449               0        1
  Public Corp.1             8153   PUBCORP       0.130   0.336               0        1
  Other Business            8153   OTHER         0.154   0.361               0        1
Industry Sector
  Manufacturing1            7418   MANUFAC       0.370   0.483               0        1
  Service                   7418   SERVICE       0.424   0.494               0        1
  Other                     7418   OTH           0.035   0.185               0        1
  Agriculture               7418   AGR           0.075   0.264               0        1
  Construction              7418   CONSTRUC      0.096   0.294               0        1
Firm Size
  Small                     8139   SMALL         0.393   0.488               0        1
  Medium1                   8139   MED           0.412   0.492               0        1
  Large                     8139   LARGE         0.194   0.396               0        1
Firm Age
  <5                        8153   LESS5         0.257   0.437               0        1
  5-151                     8153   5TO15         0.354   0.478               0        1
  >15                       8153   OVER15        0.389   0.487               0        1
Number of Competitors
  No Competitors            7907   NOCOMPET      0.111   0.314               0        1
  1-31                      7907   1TO3          0.399   0.490               0        1
  >3                        7907   MORE3COMPET   0.490   0.500               0        1
Other
  Foreign Owned             8153   FOREIGN       0.183   0.387               0        1
  Gov. Owned                8153   GOVOWN        0.120   0.325               0        1
  Exporter                  8153   EXPORT        0.339   0.473               0        1
  Fin. Statements Audited   8153   AUDIT         0.586   0.493               0        1
Secondary Parameters – Perception
  Financing                 7526   FIN           0.811   0.391               0        1
  Political Instability     7376   INSTABILE     0.842   0.365               0        1
  Organized Crime           7196   ORGCRIME      0.631   0.482               0        1
  High Taxes & Regulations  7660   TAX           0.892   0.311               0        1
  Corruption                7485   CORRUP        0.586   0.497               0        1
  Inflation                 7425   INFLAT        0.846   0.361               0        1
  Exchange Rate             7259   EXCHANGE      0.741   0.438               0        1
  Laws & Regs Inconsistent  8029   LAWS          0.478   0.493               0        1

Notes:
1Denotes the omitted category in estimation. Mean 45 Table 4: Definition of the Dependent Variable for the Interval Regression Model yi 1 2 3 4 5 6 7 Associated Interval (ao, 49] (49, 59] (59, 69] (69, 79] (79, 89] (89, 99] (99, aK] y1 y2 ao 49 59 69 79 89 99 49 59 69 79 89 99 aK Notes: § If: ao=-•, then y1=.; ao=0, then y1=0; and ao=1, then y1=1. Equally, if aK=•, then y2=.; and ak=1000, then y1=100. 46 Table 5: Estimation Results Variable Model 1 -•<Yi*<• Model 2 0<Yi*<• Model 3 1<Yi*<• Model 40£Yi*£100 Model 5 1£Yi*£100 Model 6 -•<Yi*£100 Model 7 -•<Yi*<• Model 8 0<Yi*<• Model 9 1<Yi*<• Model 100£Yi*£100 Model 11 1£Yi*£100 Model 12 -•<Yi*£100 Constant 90.279 (3.709)* -5.922 (1.746)* -2.417 (1.644) -1.044 (1.830) -3.100 (1.423)** -1.101 (2.466) 1.371 (0.976) 1.196 (1.692) -1.357 (1.487) -4.156 (1.031)* 3.454 (1.210)* 1.100 (1.045) 1.281 (1.050) 3.962 (1.475)* -3.433 (1.222)* 4.846 (1.143)* 2.904 (1.510)*** 90.315 (3.587)* -5.718 (1.684)* -2.368 (1.590) -1.027 (1.767) -3.037) (1.377)** -1.033 (2.387) 1.298 (0.941) 1.186 (1.634) -1.309 (1.436) -4.045 (0.995)* 3.345 (1.169)* 1.025 (1.007) 1.219 (1.013) 3.800 (1.422)* -3.373 (1.181)* 4.684 (1.104)* 2.871 (1.459)** 90.318 (3.580)* -5.707 (1.681)* -2.366 (1.587) -1.026 (1.764) -3.033 (1.375)** -1.030 (2.382) 1.295 (0.939) 1.186 (1.631) -1.307 (1.433) -4.039 (0.993)* 3.340 (1.167)* 1.021 (1.005) 1.216 (1.011) 3.794 (1.420)* -2.369 (1.179)* 4.676 (1.102)* 2.869 (1.456)** 82.156 (2.067)* -3.139 (0.989)* -0.938 (0.920) -0.221 (1.014) 82.158 (2.067)* -3.138 (0.988)* -0.938 (0.919) -0.221 (1.013) 82.143 (2.070)* -3.150 (0.990)* -0.940 (0.921) -0.222 (1.015) -1.021 (0.789) -1.021 (0.789) -1.021 (0.790) 0.181 (1.416) 1.089 (0.549)** 0.801 (0.961) 0.001 (0.853) -2.207 (0.587)* 1.453 (0.664)** 0.118 (0.596) 0.340 (0.589) 1.544 (0.806)*** -2.128 (0.695)* 2.213 (0.630)* 0.621 (0.828) 0.182 (1.416) 1.089 (0.549)** 0.800 (0.961) 0.001 (0.853) -2.207 (0.587)* 1.452 (0.663)** 0.118 (0.596) 0.340 (0.588) 1.543 (0.806)*** -2.128 
(0.695)* 2.213 (0.630)* 0.621 (0.828) 0.182 (1.418) 1.094 (0.550)** 0.801 (0.963) 0.001 (0.855) -2.208 (0.588)* 1.453 (0.665)** 0.120 (0.598) 0.342 (0.590) 1.551 (0.807)*** -2.127 (0.696)* 2.217 (0.631)* 0.618 (0.829) 110.105 (4.717)* -4.260 (1.994)** -0.687 (1.881) -0.844 (2.124) -3.108 (1.629)*** 1.155 (3.254) 1.986 (1.114)*** 1.737 (1.997) -0.219 (1.708) -4.114 (1.189)* 4.309 (1.378)* 1.126 (1.211) 1.080 (1.214) 1.222 (1.624) -3.817 (1.446)* 2.768 (1.322)** 4.015 (1.762)** 109.706 (4.583)* -4.176 (1.931)** -0.706 (1.824) -0.824 (2.060) -3.050 (1.58) *** 1.185 (3.160) 1.883 (1.09)*** 1.661 (1.937) -0.249 (1.657) 4.035 (1.152)* 4.171 (1.337)* 1.065 (1.172) 1.037 (1.177) 1.153 (1.571) -3.710 (1.401)* 2.689 (1.281)** 3.961 (1.710)** 109.683 (4.576)* -4.173 (1.928)** -0.707 (1.821) -0.823 (2.056) -3.047 (1.58) *** 1.186 (3.155) 1.878 (1.08)*** 1.658 (1.933) -0.250 (1.657) -4.030 (1.150)* 4.164 (1.335)* 1.062 (1.170) 1.036 (1.175) 1.152 (1.568) -3.704 (1.399)* 2.685 (1.279)** 3.956 (1.707)** 91.146 (2.418)* -2.048 (1.107)*** 0.151 (1.030) -0.001 (1.155) -0.889 (0.884) 1.453 (1.811) 1.308 (0.615)** 1.314 (1.114) 0.837 (0.967) -2.055 (0.666)* 1.701 (0.736)** -0.048 (0.678) 0.188 (0.666) 0.068 (0.873) -2.216 (0.805)* 1.124 (0.713) 1.175 (0.942) 91.147 (2.417)* -2.048 (1.107)*** 0.151 (1.030) -0.001 (1.155) -0.889 (0.884) 1.453 (1.810) 1.308 (0.615)** 1.313 (1.113) 0.836 (0.966) -2.055 (0.666)* 1.701 (0.736)** -0.049 (0.678) 0.188 (0.666) 0.067 (0.873) -2.216 (0.804)* 1.124 (0.713) 1.176 (0.942) 91.135 (2.421)* -2.052 (1.109)*** 0.151 (1.031) -0.001 (1.157) -0.890 (0.885) 1.455 (1.813) 1.315 (0.616)** 1.320 (1.115) 0.842 (0.968) -2.053 (0.667)* 1.703 (0.737)** -0.046 (0.679) 0.191 (0.667) 0.074 (0.875) -2.222 (0.806)* 1.1246 (0.714) 1.169 (0.943) Sole Prop. Partnerships Other Private Corporation Other Industry Service Agriculture Construction Small: <50 employees Large: >500 employees <5 years of age >15 years of age No Competitors >3 Competitors Foreign Owned Gov. 
Owned 47 Export Audits Obstacle: Financing Obstacle: Political Instability Obstacle: Organized Crime Obstacle: Taxes & Regs. Obstacle: Corruption Obstacle: Inflation Obstacle: Exchange Rate Obstacle: Laws & Regs. In consistent Country Dummies σ (s.e.) Mean ŷ * (s.e.) [min; max] # >100 Pseudo LogLikelihood Value McKelvey & Zavoina Pseudo R2 0.630 (0.971) 5.556 (0.986)* - 0.614 (0.937) 5.361 (0.952)* - 0.614 (0.935) 5.351 (0.950)* - 0.826 (0.544) 3.252 (0.560)* - 0.826 (0.544) 3.252 (0.560)* - 0.829 (0.545) 3.257 (0.561)* - 0.687 (1.125) 5.581 (1.155)* -2.277 (1.058)** -0.031 (1.152) 0.691 (1.091) 5.394 (1.119)* -2.214 (1.025)** -0.020 (1.116) .691 (1.089) 5.384 (1.117)* -2.210 (1.024)** -0.0186 (1.114) 0.883 (0.615) 3.357 (0.642)* -1.124 (0.582)** -0.389 (0.636) 0.883 (0.616) 3.357 (0.642)* -1.124 (0.582)** -0.389 (0.636) 0.883 (0.617) 3.362 (0.643)* -1.127 (0.583)** -0.389 (0.637) - - - - - - - - - - - - -1.787 (1.067)*** 1.728 (1.03)*** -1.724 (1.03)*** -0.687 (0.601) -0.687 (0.601) -0.689 (0.602) - - - - - - - - - - - - - - - - - - - - - - - - -5.187 (1.188)* -11.504 (1.065)* 0.388 (1.236) -1.729 (1.136) 1.701 (0.976)*** -5.037 (1.151) * -11.181 (1.029)* 0.381 (1.198) -1.701 (1.101) 1.652 (0.95)*** -5.028 (1.149) * -11.165 (1.028)* 0.381 (1.196) -1.698 (1.099) 1.649 (0.95)*** -2.561 (0.650)* -6.808 (0.606)* 0.617 (0.683) -1.263 (0.627)** 0.768 (0.542) -2.561 (0.650)* -6.808 (0.605)* 0.617 (0.683) -1.264 (0.627)** 0.769 (0.542) -2.563 (0.651)* -6.819 (0.607)* 0.616 (0.684) -1.260 (0.628)** 0.769 (0.543) - - - - - - Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes 30.487 (0.416) 91.153 (17.213) [21; 142] 2016 29.591 (0.375) 91.100 (16.470) [30; 141] 1960 29.545 (0.374) 91.098 (16.436) [30; 141] 1960 18.749 (0.170) 82.058 (9.810) [35;104] 47 18.749 (0.170) 82.060 (9.804) [36;104] 47 18.778 (0.172) 82.043 (9.858) [36;104] 47 29.763 (0.477) 92.584 (18.062) [5; 144] 1761 28.989 (0.434) 92.520 (17.332) [19; 142] 1724 28.948 (0..433) 92.520 (17.299) [20; 142] 1721 18.083 
(0..191) 82.915 (10.021) [25;105] 66 18.080 (0..191) 82.916 (10.015) [26;105] 66 18.102 (0..192) 82.899 (10.083) [23;105] 66 -11829.6 -11866.4 -11869.3 -21158.4 -21158.4 -21156.2 -8508.7 -8532.4 -8534.29 -15648.8 -15649.0 -15647.5 0.098 0.138 0.138 0.215 0.215 0.161 0.108 0.149 0.149 0.235 0.235 0.180 48 LRT-OS [d.o.f.; P-Value] 1648.5 [96; 0.00] 1637.2 [96;0.00] 1636.8 [96;0.00] 1674.9 [96;0.00] 1674.9 [96;0.00] 1677.6 [96;0.00] 1364.45 [102;0.0] 1356.75 [102;0.0] 1356.54 [102;0.0] 1367.43 [102;0.00] 1367.20 [102;0.00] Pseudo 4.557 9.989 Functional Form [.0328] [0..002] Test [P-Value] Homoskedastici 1520.841 1727.343 [0.000] [0.000] ty Test of Unknown Form [P-Value] Normality Test 2448.197 1490.885 [P-Value] [0.000] [0.000] Observations 7320 7320 7320 7320 7320 7320 5393 5393 5393 5393 5393 NOTES: § ***,** and * denote statistical significance at the 10%, 5% and 1% level respectively. ± d.o.f. denotes degrees of freedom. ‡ Omitted category is 1 “Public Corporations”, 2 “Manufacturing”, 3 “Medium”, 4 “Between 5 and 15”, 5 “Between 1 and 3”, 6 “United States”. ₣ Standard errors (s.e.) are noted in parenthesis. 1369.38 [102;0.00] - - 5393 49 Table 6: Estimation Results for Semi-log Model Variable Model 5a 1<lnYi*<100 Constant Sole Prop. Partnerships Other Private Corporation Other Industry Service Agriculture Construction Small: <50 employees Large: >500 employees <5 years of age >15 years of age No Competitors >3 Competitors Foreign Owned Gov. Owned Export Audits Obstacle: Financing Obstacle: Political Instability Obstacle: Organized Crime Obstacle: Taxes & Regs. Model 11a 1<lnYi*<100 1.898 (0.153)* 0.267 (0.073)* 0.106 (0.068) 0.040 (0.075) 0.128 (0.058)** 0.017 (0.105) -0.055 (0.040) -0.087 (0.071) 0.068 (0.063) 0.178 (0.043)* -0.135 (0.049)* Model 5b 1<lnYi*<100 Het. Corr. 
1.273 (0.150)* 0.225 (0.066)* 0.065 (0.062) 0.070 (0.072) 0.094 (0.052)*** 0.110 (0.088) -0.041 (0.037) -0.041 (0.065) 0.079 (0.058) 0.179 (0.040)* -0.131 (0.0453)* 1.185 (0.182)* 0.211 (0.082)* 0.031 (0.077) 0.040 (0.086) 0.131 (0.066)** -0.069 (0.136) -0.070 (0.046) -0.096 (0.083) 0.029 (0.072) 0.174 (0.050)* -0.163 (0.055)* Model 11b 1<lnYi*<100 Het. Corr. 0.934 (0.155)* 0.340 (0.070)* 0.060 (0.065) 0.118 (0.083) 0.132 (0.059)** 0.030 (0.125) -0.086 (0.043)** -0.087 (0.080) 0.041 (0.069) 0.201 (0.048)* -0.193 (0.051)* -0.037 (0.044) -0.066 (.043) -0.159 (0.059)* 0.143 (0.051)* -0.203 (0.046)* -0.121 (0.061)** -0.037 (0.040) -.238 (0.041)* - -0.054 (0.039) -0.079 (0.040)** -0.175 (0.056)* 0.116 (0.047)** -0.173 (0.043)* -0.117 (0.056)** -0.048 (0.036) -0.192 (0.038)* - - - -0.041 (0.050) -0.063) (0.050) -0.064 (0.065) 0.148 (0.060)** -0.124 (0.053)** -0.150 (0.071)** -0.049 (0.046) -0.239 (0.048)* 0.096 (0.04)** -0.012 (0.048) -0.057 (0.027)** -0.059 (0.045) -0.119 (0.062)** 0.084 (0.053) -0.011 (0.036) -0.154 (0.065)** -0.024 (0.042) -0.150 (0.042)* 0.045 (0.035) -0.0154 (0.036) - - 0.092 (0.045)** (0.065) (0.042) - - 0.212 (0.049)* 0.202 (0.046)* 50 Obstacle: Corruption Obstacle: Inflation Obstacle: Exchange Rate Obstacle: Laws & Regs. In consistent Country Dummies σ (s.e.) 
Table 6 (remaining rows)

                                    Model 5a          Model 5b          Model 11a          Model 11b
Obstacle: Corruption                -                 -                 0.527 (0.045)*     0.534 (0.046)*
Obstacle: Inflation                 -                 -                 -0.013 (0.051)     0.028 (0.044)
Obstacle: Exchange Rate             -                 -                 0.068 (0.047)      -0.057 (0.027)**
Obstacle: Laws & Regs. Inconsistent -                 -                 -0.067 (0.04)***   -0.046 (0.039)
Country Dummies                     Yes               Yes               Yes                Yes
σ (s.e.)                            1.379 (0.012)     -                 1.347 (0.013)      -
Pseudo Log-Likelihood Value         -16242.508        -16010.435        -11788.379         -11614.318
McKelvey & Zavoina Pseudo R2        0.200             -                 0.228              -
LRT-OS [d.o.f.; P-Value]            1589.549          4475.80           1357.86            16652.51
                                    [96; 0.000]       [96; 0.000]       [102; 0.000]       [102; 0.000]
Pseudo Functional Form Test         0.757             -                 0.001              -
  [P-Value]                         [0.384]                             [0.979]
Homoskedasticity Test of Unknown    1062.329          -                 1111.411           -
  Form [P-Value]                    [0.000]                             [0.000]
Normality Test [P-Value]            3409.997          -                 1699.6542          -
                                    [0.000]                             [0.000]
Observations                        7320              7320              5393               5393

NOTES:
§ ***, ** and * denote statistical significance at the 10%, 5% and 1% level respectively.
± d.o.f. denotes degrees of freedom.
‡ Omitted category is 1 “Public Corporations”, 2 “Manufacturing”, 3 “Medium”, 4 “Between 5 and 15”, 5 “Between 1 and 3”, 6 “United States”.
₣ Standard errors (s.e.) are noted in parenthesis.

FIGURES

Figure 1: Percentage of Sales Reported to Tax Authorities [figure not reproduced; one panel per region (Africa and Middle East, Asia, Former Soviet Union, Latin America, OECD, Transition Europe), showing the share of firms in each reporting category from <50% to 100%]. Source: World Business Environment Survey.

Figure 2: Histograms of Pseudo Residuals [figure not reproduced; two panels, “Histogram of Pseudo Residuals for Model 5a” and “Histogram of Pseudo Residuals for Model 11a”; horizontal axis: pseudo residuals; vertical axis: density].

REFERENCES

Allingham, M. and A. Sandmo. 1972. Income Tax Evasion: A Theoretical Analysis. Journal of Public Economics 1:323-338.
Andreoni, James. 1992. IRS as Loan Shark: Tax Compliance with Borrowing Constraints. Journal of Public Economics 49:35-46.
______,
Brian Erard, and Jonathan Feinstein. 1998. Tax Compliance. Journal of Economic Literature 36:818-860.
Apel, Mikael. 1994. An Expenditure-Based Estimate of Tax Evasion in Sweden. Working Paper, Department of Economics, Uppsala University.
Batra, Geeta, Daniel Kaufmann and Andrew H.W. Stone. 2003. Investment Climate Around the World: Voices of the Firms from the World Business Environment Survey. Washington, D.C.: The World Bank.
Border, Kim C. and Joel Sobel. 1987. Samurai Accountant: A Theory of Audit and Plunder. Review of Economic Studies 54:525-540.
Brehm, J.W. 1966. A Theory of Psychological Reactance. New York: Academic Press.
Chan, K. Hung and Phyllis Lai Lan Mo. 2000. Tax Holidays and Tax Non-compliance: An Empirical Study of Corporate Tax Audits in China’s Developing Economy. The Accounting Review 75:469-484.
Chen, Kong-Pin and C. Y. Cyrus Chu. 2005. Internal Control and External Manipulation: A Model of Corporate Income Tax Evasion. RAND Journal of Economics 26:151-164.
Chesher, A. and M. Irish. 1987. Residual Analysis in the Grouped and Censored Normal Regression Model. Journal of Econometrics 34:33-61.
Christian, Charles W. 1994. Voluntary Compliance with Individual Income Tax: Results from the 1988 TCMP study. The IRS Research Bulletin Publication 1500 9:35-42.
Cowell, Frank. 2004. Sticks and Carrots. In The Crisis in Tax Administration, eds. Henry Aaron and Joel Slemrod, 230-258. Washington, D.C.: The Brookings Institution Press.
Cremer, Helmuth, Maurice Marchand and Pierre Pestieau. 1990. Evading, Auditing and Taxing: the Equity-Compliance Tradeoff. Journal of Public Economics 43:67-92.
Crocker, Keith J. and Joel Slemrod. 2005. Corporate Tax Evasion with Agency Costs. Journal of Public Economics 89:1593-1610.
Deregulation Commission. 1991. Opening of Markets and Competition. Report Presented to German Federal Government, Bonn.
Erard, Brian. 1992. The Influence of Tax Auditing on Reporting Behaviour. In Why People Pay Taxes, ed. Joel Slemrod.
Ann Arbor: University of Michigan Press.
______ and Chih-Chin Ho. 2001. Searching for Ghosts: Who are the Nonfilers and How Much Tax do they Owe? Journal of Public Economics 81:25-50.
Giles, David E.A. 2000. Modeling the Tax Compliance Profiles of New Zealand Firms. In Taxation and the Limits of Government, eds. Gerald W. Scully and Patrick J. Caragata, 243-270. Boston: Kluwer Academic Publishers.
______ and Lindsay M. Tedds. 2002. Taxes and the Canadian Underground Economy. Toronto: The Canadian Tax Foundation.
Hanlon, Michelle, Lillian Mills, and Joel Slemrod. 2005. An Empirical Examination of Corporate Tax Non-compliance. Working Paper, University of Michigan.
Harvey, Andrew. 1976. Estimating Regression Models with Multiplicative Heteroscedasticity. Econometrica 44:461-65.
Huber, P.J. 1967. The Behaviour of Maximum Likelihood Estimates under Non-standard Conditions. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, 221-223. Berkeley: University of California Press.
Internal Revenue Service. 2004. Interactive Tax Gap Map. National Headquarters Office of Research.
Johnson, Simon, Daniel Kaufmann and Pablo Zoido-Lobatón. 1998a. Regulatory Discretion and the Unofficial Economy. American Economic Review 88:387-392.
______. 1998b. Corruption, Public Finances and the Unofficial Economy. Cambridge: MIT Press.
Johnson, Simon, Daniel Kaufmann, John McMillan, and Christopher Woodruff. 2000. Why do firms hide? Bribes and Unofficial Activity After Communism. Journal of Public Economics 76:495-520.
Joulfaian, David. 2000. Corporate Tax Evasion and Managerial Preferences. Review of Economics and Statistics 82:698-701.
______ and Mark Rider. 1998. Differential Taxation and Tax Evasion by Small Business. National Tax Journal 51:676-87.
Kennedy, P. 1981. Estimation with Correctly Interpreted Dummy Variables in Semilogarithmic Equations. American Economic Review 71:801.
Lyssiotou, Panayiota, Panos Pashardes and Thanasis Stengos. 2004.
Estimates of the Black Economy Based on Consumer Demand Approaches. Economic Journal 114:622-640.
Machin, S., and M. Stewart. 1990. Unions and the Financial Performance of British Private Sector Establishments. Journal of Applied Econometrics 5:327-350.
McKelvey, W. and R. Zavoina. 1975. A Statistical Model for the Analysis of Ordinal Level Dependent Variables. Journal of Mathematical Sociology 4:103-120.
Mirus, Rolf and Roger S. Smith. 1997. Self-Employment, Tax Evasion, and the Underground Economy: Micro-Based Estimates for Canada. Working Paper, International Tax Program, Harvard Law School.
Mookherjee, Dilip and Ivan P.L. P’ng. 1989. Optimal Auditing, Insurance and Redistribution. Quarterly Journal of Economics 104:399-415.
Orme, C. 1990. The Small-Sample Performance of the Information Matrix Test. Journal of Econometrics 46:309-331.
Papke, Leslie E. and Jeffrey M. Wooldridge. 2005. Panel Data Methods for Fractional Response Variables with an Application to Test Pass Rates. Working Paper, Department of Economics, Michigan State University.
Pissarides, Christopher A., and Guglielmo Weber. 1989. An Expenditure-based Estimate of Britain’s Black Economy. Journal of Public Economics 39:17-32.
Ramsey, J.B. 1969. Tests for Specification Errors in Classical Linear Least-Squares Regression Analysis. Journal of the Royal Statistical Society 31:350-71.
Rice, Eric M. 1992. The Corporate Tax Gap: Evidence on Tax Compliance by Small Corporations. In Why People Pay Taxes, ed. Joel Slemrod, 125-166. Ann Arbor: University of Michigan Press.
Sanchez, Isabel and Joel Sobel. 1993. Hierarchical Design and Enforcement of Income Tax Policies. Journal of Public Economics 50:345-369.
Schuetze, Herb J. 2002. Profiles of Tax Non-compliance Among the Self-Employed in Canada, 1969-1992. Canadian Public Policy XXVIII:219-237.
Schneider, Friedrich and Dominik Enste. 2000. “Increasing Shadow Economies All Over the World – Fiction or Reality?
A Survey of the Global Evidence of Their Size and Their Impact From 1970 to 1995.” Journal of Economic Literature 38:77-114.
Scotchmer, Suzanne. 1987. Audit Classes and Tax Enforcement Policy. American Economic Review 77:229-233.
Slemrod, Joel. 2004. The Economics of Corporate Tax Selfishness. Working Paper 10858, National Bureau of Economic Research.
______ and Shlomo Yitzhaki. 1996. The Cost of Taxation and the Marginal Efficiency Cost of Funds. International Monetary Fund Staff Papers 43:25-34.
Smith, J. and T. Adams. 1987. The Measurement of Selected Income Flows in Informal Markets, 1981 and 1985-1986. Working Paper, University of Michigan Survey Research Centre.
Smith, Stephen with Christopher A. Pissarides and Guglielmo Weber. 1986. Evidence from Survey Discrepancies. In Britain’s Shadow Economy, ed. Stephen Smith, 137-153. Oxford: Clarendon Press.
Stewart, M. 1983. On Least Squares Estimation when the Dependent Variable is Grouped. Review of Economic Studies 50:737-753.
Tannenbaum, Robert. 1993. Corporate Tax Disclosure: Good or Bad for the Commonwealth? Working Paper, Massachusetts Special Commission on Business Tax Policy.
Tedds, Lindsay M. 2005. Nonparametric Expenditure-based Estimation of Income Underreporting and the Underground Economy. Working Paper, Department of Economics, McMaster University.
Trandel, Greg and Arthur Snow. 1999. Progressive Income Taxation and the Underground Economy. Economics Letters 62:217-222.
Veall, Michael R. and Klaus F. Zimmermann. 1996. Pseudo-R2 Measures for Some Common Limited Dependent Variable Models. Journal of Economic Surveys 10:241-259.
Watson, Harry. 1985. Tax Evasion and Labor Markets. Journal of Public Economics 27:231-246.
World Bank. 1997. World Development Report 1997: The State in a Changing World. Washington, D.C.: World Bank Group.
Wooldridge, Jeffrey M. 2002. Econometric Analysis of Cross Section and Panel Data. Cambridge: The MIT Press.

APPENDIX A.
SURVEY DESIGN¹

¹ The information presented here is based on that provided in Batra et al. (2003).

The World Business Environment Survey (WBES) was conducted over a 20-month period between the latter half of 1998 and mid-2000. The WBES consists of a core survey instrument alongside a subsidiary module that addresses issues of importance to particular regions. The core survey covers a broad range of measures of the business environment and firm attributes. The survey was completed by over 10,000 firms across eighty countries and the West Bank and Gaza. Approximately one third of these countries were low income, one half were middle income and a little less than one fifth were high income. There were seven countries from East Asia, twenty-three from Eastern Europe and Central Asia, twenty from Latin America and the Caribbean, nineteen from the Middle East and North Africa, nine from the Organization for Economic Co-operation and Development (OECD), and three from South Asia. In Africa, surveys were conducted primarily by mail, while direct interviews were the predominant method used for all other firms in the sample. Response rates were lowest in Africa but generally satisfactory in other regions.

The WBES used a form of stratified random sampling, along with oversampling and nonrandom methods, to correct for biases in the representation across firms with respect to industry characteristics that were not common. To ensure adequate representation of firms by industry, size, ownership, export orientation, and location, the following sampling targets were agreed on across all regions:

• The numbers of manufacturing and service firms were allocated according to their contribution to GDP, with a 15 percent minimum for each type of firm
• At least 15 percent of the sample was in the small category (fewer than 50 employees) and at least 15 percent was in the large category (more than 500 employees)
• At least 15 percent of the companies in the sample were firms with foreign content
• At least 15 percent of firms exported at least 20 percent of their output
• At least 15 percent of firms were located in small towns or the countryside

In addition, the sample size for each country or territory was greater than or equal to 100.

Of the firms in the final sample, 20 percent were large (more than 500 employees), with the remaining 80 percent split almost equally between small (fewer than 50 employees) and medium-sized firms. Firms were also grouped into five broad industry categories as follows: 43% in services and commerce; 35% in manufacturing; 9% in construction; 7% in agriculture; and 4% in other. Firms that fell into the other category were largely from Sub-Saharan and North Africa. The majority of firms were domestically owned and privately owned. One third of the firms were export oriented, with the highest proportion in the Middle East and North Africa. Finally, the predominant form of organization was the privately held corporation, followed by sole proprietorships, partnerships, other, public corporations, and finally cooperatives.

Unfortunately, the survey does not contain survey weights. The lack of survey weights means that individual indicators should not be used for precise country rankings in any particular dimension measured by the survey.
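The sampling targets described in this appendix can be read as a set of minimum-share checks on each country sample. The following is a minimal sketch in Python, not the procedure actually used by the WBES team: the `Firm` record, its field names, and the `check_targets` function are hypothetical and introduced only for illustration, and the allocation of manufacturing versus service firms by contribution to GDP is omitted because it requires national accounts data not described here.

```python
from dataclasses import dataclass

MIN_SHARE = 0.15      # 15 percent minimum stated for each target group
MIN_COUNTRY_N = 100   # minimum sample size per country or territory


@dataclass
class Firm:
    """Hypothetical record for one surveyed firm (field names are assumptions)."""
    employees: int
    foreign_content: bool
    export_share: float   # fraction of output exported, between 0.0 and 1.0
    small_town: bool      # located in a small town or the countryside


def check_targets(firms):
    """Return a dict mapping each sampling target to True if the sample meets it."""
    n = len(firms)

    def share(predicate):
        # Fraction of firms in the sample satisfying the predicate.
        return sum(1 for f in firms if predicate(f)) / n if n else 0.0

    return {
        "sample_size": n >= MIN_COUNTRY_N,
        "small": share(lambda f: f.employees < 50) >= MIN_SHARE,
        "large": share(lambda f: f.employees > 500) >= MIN_SHARE,
        "foreign_content": share(lambda f: f.foreign_content) >= MIN_SHARE,
        "exporters": share(lambda f: f.export_share >= 0.20) >= MIN_SHARE,
        "small_town": share(lambda f: f.small_town) >= MIN_SHARE,
    }
```

Calling `check_targets` on the list of firms for one country returns one flag per target, which makes it easy to see which group would need oversampling. Because the survey provides no weights, a check of this kind describes only the composition of the sample, not of the underlying population of firms.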