ARTICLE IN PRESS ADIAC-00048; No of Pages 9 Advances in Accounting, incorporating Advances in International Accounting xxx (2009) xxx–xxx Contents lists available at ScienceDirect Advances in Accounting, incorporating Advances in International Accounting j o u r n a l h o m e p a g e : w w w. e l s e v i e r. c o m / l o c a t e / a d i a c 1 Monetary unit sampling: Improving estimation of the total audit error 2 Huong N. Higgins a,⁎, Balgobin Nandram b,1 a b Worcester Polytechnic Institute, Department of Management, 100 Institute Road, Worcester, MA 01609, United States Worcester Polytechnic Institute, Department of Mathematical Sciences, 100 Institute Road, Worcester, MA 01609, United States OO F 3 4 5 a r t i c l e i n f o a b s t r a c t In the practice of auditing, for cost concerns, auditors verify only a sample of accounts to estimate the error of the total population of accounts. The most common statistical method to select an audit sample is by monetary unit sampling (MUS). However, common MUS estimation practice does not explicitly recognize the multiple distributions within the population of account errors. This often leads to excessive conservatism in auditors' judgment of population error. In this paper, we review the common MUS estimation practice, and introduce our own method which uses the Zero-Inflation Poisson (ZIP) distribution to consider zero versus non-zero errors explicitly. We argue that our method is better suited to handle the real populations of account errors, and show that our ZIP upper bound is both reliable and efficient for MUS estimation of accounting data. © 2009 Elsevier Ltd. All rights reserved. Available online xxxx PR 6 7 Q1 8 910 11 ED 25 24 1. Introduction 27 In auditing a company's accounts receivable, the goal of auditors is to verify if the values of the accounts reported by the company are not materially misstated. To determine the total error in the amount reported by the company, auditors must audit all accounts in the population of accounts. However, it is costly and it takes much time to audit all accounts. In practice, auditors audit only a sample to estimate the total error in the population of accounts. A common statistical sampling method is monetary unit sampling (MUS), where each monetary unit (e.g., dollar) is equally likely to be included in the sample. In this method, an account with a book value of $10,000 is ten times as likely to be sampled as an account with a book value of $1000. This paper addresses the estimation process for MUS sampling. Common MUS estimation has a shortcoming because it does not explicitly recognize that a total population of account errors typically consists of distinct distributions, namely one large mass with zero error, a second distribution of small errors, and a third distribution of 100% errors. These distribution characteristics of accounting error populations have been discussed in prior research (e.g. Kaplan, 1973; Neter & Loebbecke, 1975; Chan, 1988). Due to this shortcoming, in practice sample accounts are incorrectly assumed to have similar tainting (ratio of Error-Per-Dollar) to non-sample accounts. This 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 RE 32 33 CO R 30 31 UN 28 29 assumption, combined with MUS sampling bias towards selecting larger accounts, often leads to very large estimations of total error in the population and overly conservative auditors' decisions. We acknowledge that the major risk of accounts receivable in financial reporting is overstatement, so generally auditors should prefer methods that result in larger error estimations. However, we argue that estimations in practice are too conservative, and excessive conservatism has its own practical problems. For example, discussions with auditors in three large accounting firms reveal that clients rarely approve large adjustments for error estimations (Elder & Allen, 1998). This, along with the cost of statistical expertise, may cause auditors not to project sample errors to the population at all. Indeed, surveys of auditors reveal that they often fail to project errors to populations, notwithstanding audit standards and professional guidelines (Burgstahler & Jiambalvo, 1986; Akresh & Tatum, 1988; Hitzig, 1995; Elder & Allen, 1998). Overall, excessive conservatism in the estimation method may lead to projection problems, which impairs the auditor's overall decision process. We introduce our own approach which is better suited to handle the real distributions of account errors. We use a regression with selection probability as a covariate to address MUS sampling bias. Our technical innovation lies in the development of the Zero-Inflation Poisson (ZIP) model, using the Poisson distribution to treat errors as rare count data, and increasing the probabilities of zero errors to inflate the weights of these observations. By handling accounts with zero versus non-zero errors explicitly, our technique does not rely on the assumption of similar tainting in the population. We provide mathematical expressions and, upon request, will furnish the software codes for our ZIP estimation and confidence intervals. Through simulations, we show that our ZIP bound is reliable and efficient for CT 26 ⁎ Corresponding author. Tel.: +1 508 831 5626; fax: +1 508 831 5720. E-mail addresses: hhiggins@wpi.edu (H.N. Higgins), balnan@wpi.edu (B. Nandram). 1 Tel.: +1 508 831 5539; fax: +1 508 831 5824. 0882-6110/$ – see front matter © 2009 Elsevier Ltd. All rights reserved. doi:10.1016/j.adiac.2009.06.001 Please cite this article as: Higgins, H. N., & Nandram, B., Monetary unit sampling: Improving estimation of the total audit error, Advances in Accounting, incorporating Advances in International Accounting (2009), doi:10.1016/j.adiac.2009.06.001 13 14 12 15 16 17 18 19 20 21 22 23 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 ARTICLE IN PRESS 90 2. Monetary unit sampling 91 92 In performing substantive tests of accounts, an auditor often takes a sample to estimate the true total balance of all accounts. Ultimately, the auditor's goal is to estimate the total amount of error, which is the difference between the true total balance of all accounts and the reported balance amount. The standards for audit sampling are set by the Audit Standards Board's Statement on Auditing Standards (SAS) No 39, (AICPA, 1981) amended by SAS No. 111 (AICPA, 2006). Both statistical and non-statistical sampling methods are allowed for substantive tests in auditing (AICPA, 1981, 2008). Both are used by a wide-range of public accounting firms (Nelson, 1995) and government agencies (Annulli, Mulrow & Anziano, 2000). Some surveys show that non-statistical sampling is used more than half the time (Annulli et al., 2000; Hall, Hunton & Pierce, 2000, 2002). Of course, non-statistical sampling procedures do not afford the full control provided by statistical theory, and so raise questions concerning auditors' evaluation of sample results (Messier, Kachelmeier & Jensen, 2001). Among statistical sampling techniques, monetary unit sampling (MUS) is the most commonly used for substantive tests (Tsui, Matsumura & Tsui, 1985; Smieliauskas, 1986b; Grimlund & Felix, 1987; Hansen, 1993; Annulli et al., 2000; AICPA, 2008). The authoritative guide for the auditing profession, the AICPA's Audit Guide — Audit Sampling (AICPA, 2008), has detailed MUS instructions. MUS is a statistical sampling method where the probability of an item's selection for the sample is proportional to its recorded amount (probability proportional to size). MUS can be thought of as employing the ultimate in stratification by book amount. No further stratification by book amount is possible with dollar units because all sampling units are of exactly the same size in terms of book value. Consequently, MUS incorporates efficiency advantages similar to those of stratification by book value without requiring stratification. Despite its widespread use, the relative performance of MUS compared to traditional normal distribution variables is often not clear (Smieliauskas, 1986b). Prior research has developed a number of methods for evaluating MUS samples (Tsui et al., 1985; Smieliauskas, 1986b; Grimlund & Felix, 1987). Evaluating criteria include sample size (Kaplan, 1975), sampling risks (Smieliauskas, 1986b), sample size implications of controlling for the same level of sampling risks (Smieliauskas, 1986b), and bounds (Tsui et al., 1985; Dworin & Grimlund, 1986). Obtaining reliable bounds on the total error in the population is desirable for making decisions at different confidence levels and probabilistic statements. There are many methods to compute bounds for MUS sampling (Felix, Leslie & Neter, 1981; Swinamer, Lesperance & Will, 2004). The Stringer bound, introduced by Stringer (1963), is used extensively by auditors (Bickel, 1992; AICPA, 2008). A feature of thee Stringer bound which is particularly attractive to auditors is that it provides a non-zero upper bound even when no errors are observed in the sample. Simulations show that the Stringer bound reliably exceeds the true audit error (Swinamer et al., 2000). This reliability is favorable to auditors who are concerned with strictly overstatements (or strictly understatements) in financial statements. The Stringer 86 87 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 Q2 140 141 142 3. Data and common estimation 145 3.1. Data 146 To illustrate the common estimation method, we use data of a company used by Lohr (1999) in her demonstration of MUS. The company has a population of N = 87 accounts receivable, with a total book balance of $612,824. We know the book values b1, b2, …, bN for each account in the population. Let B denote the total book value of all accounts receivable in the population, 147 148 B = b1 + b2 + … + bN = 612; 824: 149 150 151 152 ð1Þ 153 154 If each account in the population were audited, we would obtain 155 the set of audit values a1, a2, …, aN. Let A denote the unknown total 156 audit value in our data, 157 A = a1 + a2 + … + aN : ð2Þ We define the error on any account i (i = 1, 2, …, N) by di = bi − ai, di ≥ 0. After performing audit on a sample size n, which we suppose is predetermined, we observe a1, a2, …, a n (see Kaplan, 1975; Menzefricke, 1984 for discussions of issues in determining the sample size). We wish to fit an + 1, …, aN to predict the total error from all accounts. Let D denote the total error in the population of all accounts receivable, D = d1 + d2 + … + dN : 158 159 160 161 162 163 164 165 166 167 ð3Þ 168 169 The total error D is also the difference between the total book value 170 and the total audit value, 171 D = B−A: ð4Þ 172 173 Our ultimate goal is to predict total error D. Because D is unknown, it is standard practice to estimate its mean and upper bound. The book values, audit values, selection probabilities and errors of all accounts are tabulated in Table 1. Selection probability is the probability of an account being selected from all the 87 accounts, equal to the book value of an account divided by the total book balance $612,824 (e.g. for account 3, the selection probability = 6842/612,824 = 0.011164706). Panel A describes the audit sample, and Panel B the nonsample. The audit values and errors of accounts in the audit sample are known, but these values are unknown for accounts in the non-sample. In this company, only a small proportion of audited accounts (4 of 20, or 80%) are subject to error. 174 175 3.2. Common estimation 187 SAS No. 39 on audit sampling (AICPA, 1981) promulgates that auditors should estimate the total error in the population by projecting sample errors to the population. A common estimation method of the mean population error D is based on tainting, which equals the average error amount per dollar in the audited accounts. Taintings are multiplied by the total dollars in the sampling intervals, or average tainting is multiplied by the total dollars in the population, to yield an estimate of the total error dollars in the population. This method is prescribed by the AICPA Audit Guide — Audit Sampling (AICPA, 2008). Besides the AICPA guide, many professional publications also prescribe this MUS evaluation method (Gafford & Carmichael, 1984; Guy & Carmichael, 1986; Schwartz, 1997; Yancey, 2002). 188 189 RR EC 85 CO 83 84 UN 81 82 bound is also simple to compute and the required statistical tables are 143 readily available. 144 TE 88 89 MUS estimation of accounting data. Overall, our paper serves professionals and academicians by pointing out an improvement potential for MUS estimation. The remainder of the paper is as follows. Section 2 reviews the monetary unit sampling method in the accounting literature and practice of auditing. Section 3 describes our data and demonstrates a common method in current practice to estimate the total error in a population of accounts from which a dollar unit sampling is taken. Section 4 introduces our Zero-Inflated Poisson (ZIP) estimation method. Section 5 reports simulations to validate our ZIP upper bound. Section 6 concludes the paper with a brief summary. RO OF 79 80 H.N. Higgins, B. Nandram / Advances in Accounting, incorporating Advances in International Accounting xxx (2009) xxx–xxx DP 2 Please cite this article as: Higgins, H. N., & Nandram, B., Monetary unit sampling: Improving estimation of the total audit error, Advances in Accounting, incorporating Advances in International Accounting (2009), doi:10.1016/j.adiac.2009.06.001 176 177 178 179 180 181 182 183 184 185 186 190 191 192 193 194 195 196 197 198 199 200 ARTICLE IN PRESS H.N. Higgins, B. Nandram / Advances in Accounting, incorporating Advances in International Accounting xxx (2009) xxx–xxx t1:1 t1:2 t1:3 3 Table 1 Description of data. Panel A: sample Account number Book value (b) Audit value (a) Selection probability (πi) Error (d = b − a) t1:5 t1:6 t1:7 t1:8 t1:9 t1:10 t1:11 t1:12 t1:13 t1:14 t1:15 t1:16 t1:17 t1:18 t1:19 t1:20 t1:21 t1:22 t1:23 t1:24 t1:25 t1:26 t1:27 3 9 13 24 29 34 36 43 44 45 46 49 55 56 61 70 74 75 79 81 Total error in the sample 6842 16,350 3935 7090 5533 2163 2399 8941 3716 8663 69,540 6881 70,100 6467 21,000 3847 2422 2291 4667 31,257 6842 16,350 3935 7050 5533 2163 2149 8941 3716 8663 69,000 6881 70,100 6467 21,000 3847 2422 2191 4667 31,257 0.011164706 0.026679765 0.006421093 0.01156939 0.009028693 0.003529562 0.003914664 0.014589833 0.006063731 0.014136196 0.113474668 0.011228346 0.11438847 0.010552785 0.034267587 0.006277496 0.003952195 0.003738431 0.007615563 0.051004856 0 0 0 40 0 0 250 0 0 0 540 0 0 0 0 0 0 100 0 0 940 t1:28 Account number Book value (b) Audit value (a) Selection probability (πi) Error (d = b − a) Account number Book value (b) Audit value (a) Selection probability (πi) Error (d = b − a) t1:29 t1:30 t1:31 t1:32 t1:33 t1:34 t1:35 t1:36 t1:37 t1:38 t1:39 t1:40 t1:41 t1:42 t1:43 t1:44 t1:45 t1:46 t1:47 t1:48 t1:49 t1:50 t1:51 t1:52 t1:53 t1:54 t1:55 t1:56 t1:57 t1:58 t1:59 t1:60 t1:61 t1:62 1 2 4 5 6 7 8 10 11 12 14 15 16 17 18 19 20 21 22 23 25 26 27 28 30 31 32 33 35 37 38 39 40 41 2459 2343 4179 750 2708 3073 4742 5424 9539 3108 900 7835 1091 2798 5432 2325 1298 5594 2351 7304 4711 4031 1907 3341 8251 4389 5697 7554 8413 4261 7862 3153 4690 6541 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 0.004013 0.003823 0.006819 0.001224 0.004419 0.005014 0.007738 0.008851 0.015566 0.005072 0.001469 0.012785 0.00178 0.004566 0.008864 0.003794 0.002118 0.009128 0.003836 0.011919 0.007687 0.006578 0.003112 0.005452 0.013464 0.007162 0.009296 0.012327 0.013728 0.006953 0.012829 0.005145 0.007653 0.010674 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 47 48 50 51 52 53 54 57 58 59 60 62 63 64 65 66 67 68 69 71 72 73 76 77 78 80 82 83 84 85 86 87 9074 8746 7141 2278 3916 2192 5999 5856 7642 8846 2486 2074 3081 7123 5496 7461 6333 13,597 1317 5437 4030 2620 2416 5882 6596 2626 7571 1331 5924 4356 6618 5658 6943 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 0.014807 0.014272 0.011653 0.003717 0.00639 0.003577 0.009789 0.009556 0.01247 0.014435 0.004057 0.003384 0.005028 0.011623 0.008968 0.012175 0.010334 0.022187 0.002149 0.008872 0.006576 0.004275 0.003942 0.009598 0.010763 0.004285 0.012354 0.002172 0.009667 0.007108 0.010799 0.009233 0.01133 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . t1:63 Book value is the amount reported by the audited company. Audit value is the value audited and verified by the auditors. Selection probability is the probability of which an account is selected from all the accounts in the population, equal to the book value of an account divided by the total book balance $612,824 (e.g. for account 3, the selection probability = 6842 / 612824 = 0.011164706). Error is the difference between book value and audit value. Panel A shows the accounts selected by the auditors to be audited (sample), and Panel B the remainder of the population of accounts (non-sample). Accounts in the non-sample do not have audit value. 201 202 203 204 205 206 207 UN CO R RE CT ED PR Panel B: non-sample OO F t1:4 Table 2 illustrates this estimation process. Error-Per-Dollar (tainting) indicates the difference per dollar for each account, equal to the difference in book and audit values divided by the corresponding book value. Account 24, for example, has a book value of $7090 and an error of $40. The error is prorated to every dollar in the book value, leading to an error (tainting) percentage of $40/$7090 = 0.005642 for each of the 7090 dollars. The average error for the individual dollars in the sample is the sum of all Errors-Per-Dollar divided by 20, the number of accounts in the sample, or 0.008063. Thus the total error for the population D is estimated as 0.008063 ⁎ 612,824 = $4941. Often, audit units greater than the sampling interval are excluded from the population before sampling (AICPA, 2008). These units are examined separately, and their projected misstatements equal their full misstatements, not an extension of tainting. For example, the Please cite this article as: Higgins, H. N., & Nandram, B., Monetary unit sampling: Improving estimation of the total audit error, Advances in Accounting, incorporating Advances in International Accounting (2009), doi:10.1016/j.adiac.2009.06.001 208 209 210 211 212 213 214 ARTICLE IN PRESS 4 t2:1 t2:2 t2:3 H.N. Higgins, B. Nandram / Advances in Accounting, incorporating Advances in International Accounting xxx (2009) xxx–xxx Table 2 Common error estimation for dollar unit sampling. Account number Book value (b) Audit value (a) Selection probability (πi) Error (d = b − a) Error-Per-Dollar (tainting) 6842 16,350 3935 7050 5533 2163 2149 8941 3716 8663 69,000 6881 70,100 6467 21,000 3847 2422 2191 4667 31,257 0.011164706 0.026679765 0.006421093 0.01156939 0.009028693 0.003529562 0.003914664 0.014589833 0.006063731 0.014136196 0.113474668 0.011228346 0.11438847 0.010552785 0.034267587 0.006277496 0.003952195 0.003738431 0.007615563 0.051004856 0 0 0 40 0 0 250 0 0 0 540b 0 0 0 0 0 0 100 0 0 940 0 0 0 0.005642 0 0 0.10421 0 0 0 0.007765 0 0 0 0 0 0 0.043649 0 0 Projected mistatement (taintinga sampling interval) t2:4 t2:5 t2:6 t2:7 t2:8 t2:9 t2:10 t2:11 t2:12 t2:13 t2:14 t2:15 t2:16 t2:17 t2:18 t2:19 t2:20 t2:21 t2:22 t2:23 t2:24 t2:25 t2:26 3 6842 9 16,350 13 3935 24 7090 29 5533 34 2163 36 2399 43 8941 44 3716 45 8663 46 69,540b 49 6881 55 70,100 56 6467 61 21,000 70 3847 74 2422 75 2291 79 4667 81 31,257 Total error in the sample Average Error-Per-Dollar Total projected error in the population t2:27 t2:29 t2:28 t2:30 Book value is the amount reported by the audited company. Audit value is the value audited and verified by the auditors. Selection probability is the probability of which an account is selected from all the accounts in the population, equal to the book value of an account divided by the total book balance. Error is the difference between book value and audit value. Error-Per-Dollar is the ratio between error and book value. The sampling interval is $612,824/20 = $30,641.20. Total error D for the population is estimated as Average Error-Per-Dollar ⁎ total book value, or 0.008063 ⁎ 612,824 = $4941. a The Average Error-Per-Dollar is the sum of all error-per-dollars divided by the number of accounts in the sample (20). b This audit unit is greater than the sampling interval and may be considered separately from the sample evaluation. 218 219 220 221 222 223 n k=j + 1 226 n k n−k p ð1−pÞ = 1−α; k ð5Þ if j b n and p(n; 1 − α) = 1. The Stringer bound for the mean of tainting μ is M CO ð6Þ j=1 232 233 234 235 236 237 238 239 240 241 242 243 244 The upper bound of the total error in the population is μ–ST ⁎ B. There is a similar bound for the case in which X has a Poisson (n ⁎ p) distribution. The AICPA (2008) publishes tables as aids in calculating the Stringer upper bound (see Table C.3. “Monetary Unit Sampling — Confidence Factors for Sample Evaluation” in Appendix C of the Audit Guide). Our computation of the Stringer 95% upper bound based on the binomial distribution for calculating p(n;1 − α) is $129,380. A replication of the Stringer computation using the Poisson instead of binomial distribution for p(n;1 − α) yields a similar value, $133,518. The above results, which are based on the common estimation approach, can be summarized as follows. The sample contains known errors of $940. The total error projected in the population is $4941. The Stringer 95% upper confidence bound of total error in the population is $133,518. Thus, there is a 5% risk that the recorded amount of $612,824 is overstated by more than $133,518. UN 230 231 RO OF DP 4941 μ ST ≡ pð0; 1−αÞ + ∑ ½pðj; 1−αÞ−pðj−1; 1−αÞzj : 227 228 229 1337.46 0.008063a projection from account 46 would be its full misstatement of $540 instead of $237.94. In this case, the total projected error in the population would be $5243 instead of $4941. A common method to estimate the upper bound of D is the Stringer bound. Let Ti be the tainting of the i-th selected item, Ti ≡ di/ bi. If M ≡ number of non-zero Ti let 0 b zM ≤ … ≤ z1 be the ordered non zero Ti. Let p(j; 1 − α) be the 1 − α exact upper confidence bound for p when X has a binomial (n, p) distribution and X = j. Thus p(j; 1 − α) is the unique solution of ∑ 225 224 237.94b We argue that the approach described above is too conservative. Compared to other bounds, for example the Multinomial-Dirichlet bound computed at $11,125 for the same sample, the Stringer bound is too large. References for Multinomial bounds can be found in Swinamer et al. (2004) and Neter, Leitch and Fienberg (1978). As Stringer (1963) stresses, if the population is free of errors it gives the same answer (overestimate), no matter what sample is drawn from the population. In fact, simulation studies (Reneau, 1978; Leitch, Neter, Plante & Sinha, 1982) find that the Stringer bound is always too conservative. The trade-off for high reliability is loss of efficiency: the Stringer bound is always much larger than the population total error (Bickel, 1992; Swinamer et al., 2004). It is probable that many auditors do not project sample results to populations to avoid unrealistic conservatism and client resistance. However, failing to project leads to judgment error regarding the financial statements overall. We seek to introduce an alternative method that is reliable but more efficient, to help auditors make realistic projection and probabilistic statements about the population error. 245 4. Zero-Inflated Poisson regression 263 To improve on the common estimation method described in Section 3, our aim is to estimate total error D and its 95% upper confidence bound. In the first step, we develop the Poisson regression to view each dollar in an account either as error or not (as count data). In the second step, we further develop the Poisson regression to count data with a large number of zero values to suit populations of account errors. 264 265 4.1. Ordinary Poisson regression 271 We use the generalized regression model to allow for count data so that we can capture the distribution of our data more accurately (see Nandram, Sedransk & Pickle, 2000 for a detailed discussion of the generalized regression model and specifically the Poisson regression). 272 TE 216 217 3193.12 RR EC 215 172.87 Please cite this article as: Higgins, H. N., & Nandram, B., Monetary unit sampling: Improving estimation of the total audit error, Advances in Accounting, incorporating Advances in International Accounting (2009), doi:10.1016/j.adiac.2009.06.001 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 266 267 268 269 270 273 274 275 ARTICLE IN PRESS H.N. Higgins, B. Nandram / Advances in Accounting, incorporating Advances in International Accounting xxx (2009) xxx–xxx 280 281 282 283 284 285 286 287 288 289 290 291 di jλi ind e Poissonðλi bi Þ ð7Þ ln λi = ðβ0 + r0i Þ + ðβ1 + r1i Þπi 292 293 294 295 296 297 298 where i = 1,2, …, N, r0i and r1i are random perturbations and are independent and identically distributed with mean 0 and finite variances, and λi is the error rate for a dollar, or equivalently, the probability that a dollar is in error. Now we have two sets of random coefficients β0 + r0i and β1 + r1i. Note that the model with r0i = r1i = 0, i = 1,2, …, N, is di jλi ind e Poissonðλi bi Þ ln λi = β0 + β1 πi : 299 300 305 306 di jλi ind e Poissonðbi e 310 311 β0 + β1 πi Þ: ð9Þ Centering allows us to use this simpler model to study the original, more complex model. Because the dis are independent, the loglikelihood function is n CO R 307 308 309 n n i=1 i=1 β0 + β1 πi f ðβ0 ; β1 Þ = β0 ∑di + β1 ∑ πi di − ∑ bi e i=1 312 313 316 317 318 319 320 321 : n i=1 ðhÞ =M = E h=1 ð13Þ and the standard error is 329 330 sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi M S= ∑ ðE ðhÞ i=1 P 2 − E Þ = ðM−1Þ: ð14Þ To obtain a 95% confidence interval for the total overstatement error, we order the E(h) from smallest to largest and pick the 250th and 9750th value from the lower and upper end of the confidence interval. The 95% upper confidence bound is the 9500th value. N ∑ d˜i 331 332 333 334 335 336 4.2. Zero-Inflation Poisson regression 337 We now develop the Poisson regression model to count data with many zero values. This is our major innovation to address MUS estimation errors. We assume that with probability θ the only possible observation is 0 (i.e. pure zero), and with probability1 − θ, a Poisson (λ) random variable is observed. In other words, we treat errors as nearly impossible, but allow errors to occur according to a Poisson (λ) distribution. This treatment is termed Zero-Inflation Poisson (ZIP). For a pioneering description of ZIP regression, see Lambert (1992). Both the probability θ of the perfect zero error state and the mean number of errors λ in the imperfect state may depend on covariates. Pertinent covariates, such as trade relationships, credit terms, product prices, etc… can lead to improved precision in statistical inference. For a discussion of how to introduce covariates, see Nandram et al. (2000), for example. Sometimes θ and λ are unrelated; other times θ is a simple function of λ. In either case, ZIP regression models are easy to fit, and the maximum likelihood estimates are approximately normal in large samples of at least 25 (Lambert, 1992). As in the ordinary Poisson model in Section 4.1, our model is Eq. (7). To increase the probabilities of zero errors, we use the following distribution for di, 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 β 0 + β 1 πi ð10Þ We use maximum likelihood estimate in the centered model to obtain MLEs. Preliminary estimations are possible at this step, however they do not benefit from the zero inflation process, so they are less adjusted for populations with a large number of zero errors. We show in Appendix A how to obtain 10,000 copies of the overstatement error. We use two estimators: the predicted estimator and the projected estimator. The two estimators are defined as: Predicted: ∑ di + M ∑E pðdi = 0Þ = θ + ð1−θÞe−bi e UN 314 315 from Appendix A, and M = 10,000, h = 1,2, …, M. An estimate of the 327 overstatement error is 328 CT 303 304 This is another important feature of our method because observations are treated according to their selection probabilities π, and so sample and non-sample observations are not assumed to be the same. The first model is centered on the second model. Centering is achieved by taking r0i = r1i = 0, so that RE 301 302 be the h-th copy of the 10,000 copies obtained 326 i=1 ED ð8Þ ðhÞ Let EðhÞ = ∑ di OO F 278 279 We use random coefficients, instead of fixed coefficients as in a standard linear regression, to treat each account as having its own set of regression coefficients. By allowing random coefficients, we can model the data more flexibly and therefore can achieve more accuracy. This is an important feature of our method to incorporate random effects. Using random coefficients also allows us to build confidence intervals and upper confidence bounds. We approximate the distribution of the random coefficients by using a standard mode– Hessian normal approximation (Gelman, Carlin, Stern & Rubin, 2004), which allows us to draw samples of the regression coefficients. This permits us to make copies of the total errors, thus we get the distribution of the total error. In a sample of 10,000 we obtain the 250th and 9750th values to form the lower and upper ends of the 95% confidence interval. To consider each dollar in an account is either in error or not, our model is, 5 PR 276 277 N ð11Þ i=n + 1 ð15Þ pðdi ≠0Þ = ð1−θÞ β0 + β1 πi di −bi eβ0 ðbi e Þ e di ! + β 1 πi where di = 1,2,3, …. The random effect corresponding to θ is θ + r21.We do not treat the dis equally. Because most of the errors in the sample are zero, different treatments of zero-valued dis and non-zero dis are desired to give more appropriate weights to the zero values. We center the ZIP model by taking r0i = r1i = r2i = 0, i = 1, 2, …, N, similar to Eqs. (8) and (9). Since the dis are independent, we consider their joint probability density function: 359 360 361 362 363 364 365 366 367 368 322 323 N Projected: ∑d˜i i=1 ð12Þ Lðθ; β0 ; β1 Þ = ∏ di = 0 324 325 where dĩ is the predicted value of di. n β −b e 0 θ + ð1−θÞe i + β1 πi o ( ∏ dt N 0 ð1−θÞ ðbi e β0 + β1 πi β0 + β1 πi di −bie Þ e di ! ) : ð16Þ 369 370 Please cite this article as: Higgins, H. N., & Nandram, B., Monetary unit sampling: Improving estimation of the total audit error, Advances in Accounting, incorporating Advances in International Accounting (2009), doi:10.1016/j.adiac.2009.06.001 ARTICLE IN PRESS Thus the log-likelihood function is n −bieβ0 Δðθ; β0 ; β1 Þ = ∑ ln½θ + ð1−θÞe + β1 πi π=1 M + ∑ l=π + 1 372 373 374 375 376 + ðN−nÞ lnð1−θÞ ð17Þ M di ðβ0 + β1 πi Þ− ∑ bie β0 + β1 πi l=π + 1 : We use the Expectation Maximization (EM) algorithm to maximize the function over θ and ß (see Appendix B). The values we obtain from the above procedures are: θ̃ = 0:80 β̃0 = −3:25 β̃1 = −14:36: 378 377 379 380 396 5. Simulations 397 We perform three simulations to assess the reliability and efficiency of our ZIP upper bound, using the following definitions by Swinamer et al. (2004): a 95% upper confidence bound is reliable if, when used repeatedly, the bound exceeds the true error 95% of the time. Efficiency measures the size of the bound: the smaller the bound is, the more efficient it is said to be. We compare the reliability and efficiency of our ZIP method with the Stringer bound, because this bound is used the most extensively by accounting auditors as discussed in previous sections. We also compare our ZIP method with the Multinomial-Dirichlet bound, because this bound demonstrates the best reliability for a variety of populations (Tsui et al., 1985). More MUS bounds exist but they have relatively more reliability problems and they all have pros and cons (Grimlund & Felix, 1987; Swinamer et al., 2004). 385 386 387 388 389 390 391 392 393 394 398 399 400 401 402 403 404 405 406 407 408 409 410 t3:1 RR EC 383 384 Table 3 Simulation results. t3:2 t3:3 Reliability 1 UN t3:4 t3:5 t3:6 t3:7 t3:8 t3:9 t3:10 t3:11 t3:12 t3:13 t3:14 t3:15 CO 381 382 TE 395 Our estimations are based on two estimators, the predicted estimator and the projected estimator as defined in Eqs. (11) and (12). We estimate the mean, standard error, confidence interval, and upper bound using in a manner as described in Eqs. (13) and (14). We show in Appendix B how to obtain 10,000 copies of the overstatement error. The predicted estimator has a mean of $3272, a standard error of $1037, a 95% confidence interval of [$1487; $5336], and a 95% upper confidence bound of $4848. The projected estimator has a mean of $3435, a standard error of $1463, a 95% confidence interval of [$931; $6107], and a 95% upper confidence bound of $5444. As per design, we argue our ZIP method has advantage because it addresses the selection bias in MUS towards selecting larger accounts, it handles populations with a large number of zero values, and it handles zero versus non-zero values explicitly separately. As a result of this design, ZIP should be better suited than common estimation practice for accounting populations. Simulation 1 consists of three steps. First, we fill in the missing audit value for the 87 − 20 = 67 accounts in our data set. We use the bivariate Parzen–Rosenblut kernel density estimator, an independent non-parametric model (Silverman et al., 1986) to fill in the audit values. To model errors, we formulate the missing audit values in the range 0.90 book value b audit value b book value. We create the scenario where 80% of the book values and the audit values are in perfect agreement. This population is kept fixed throughout. Because we generate all the necessary audit values, we have the entire population and the true total population error. Second, we draw 1000 PPS samples (probabilities of selection are proportional to the book values) without replacement of size 20 from the single populations we construct in the first step. Third, we fit the ZIP regression model in exactly the same manner we discussed in Section 3 for the observed data. We compute 90% confidence interval for each fixed population, whose upper end provides a 95% upper confidence bound. For comparison, we compute the same for the Stringer bound and the Multinomial-Dirichlet bound. The results are shown in Table 3. Table 3, Columns 2–6 show the reliability results, specifically the frequency in which the bounds exceed the true population error. In Simulation 1, the ZIP bound exceeds the true population error 89.5% of the times, whereas it is supposed to exceed 95% of the times. In contrast, the Stringer bound exceeds the true error 100% of the times, while this frequency is 96% for the Multinomial-Dirichlet bound. These results demonstrate ZIP's lower reliability than the theoretical level and the extra conservatism of the Stringer bound. To report on the efficiency, Columns 5 and 6 show the frequency in which the ZIP bound is larger (more inefficient) than the Stringer bound and the Multinomial-Dirichlet bound. In Simulation 1, these frequency numbers are very close to zero (0.7% and 9.6%, respectively), indicating the ZIP bound is almost always smaller, or more efficient. To further shed light on the size of the other bounds relative to the ZIP bound, Columns 7 and 8 show the ratio of the differential size over the ZIP bound. On average, this ratio is about 13 times for the Stringer bound, and 25.8% for the Multinomial-Dirichlet bound. These results demonstrate the far greater efficiency of the ZIP bound compared to the other two. Simulation 2 is similar to the first, except that the audit values are generated by the ZIP model. The generated data have similar characteristics to accounting data in the sense that they come from a distribution that presumes a very large number of zero errors. From Column 2, the ZIP bound is reliable as its 95% upper bound exceeds the true error value 96.8% of the times. From Columns 5 and 6, the frequency in which the ZIP bound is larger than the other two bounds is very small (0.8% and 3%, respectively), denoting ZIP's greater efficiency. Simulation 3 is similar to the first two, except that the audit values are generated to have error distributions identical to Population 4 in RO OF 371 H.N. Higgins, B. Nandram / Advances in Accounting, incorporating Advances in International Accounting xxx (2009) xxx–xxx DP 6 Simulation 1 Data from Parzen–Rosenblut model Simulation 2 Data from ZIP model Simulation 3 Data from Neter Loebbecke population 4 |{z} Efficiency 2 3 4 5 6 7 8 ZNT SNT MNT ZNS ZNM (S − Z) / Z (M − Z) / Z 0.895 1.000 0.960 0.007 0.096 12.989 0.258 0.968 1.000 1.000 0.008 0.030 0.342 0.188 0.984 1.000 1.000 0.009 0.044 0.343 0.182 |{z} Proportion of 1000 simulations |{z} Average over 1000 simulations Z: ZIP bound, S: Stringer bound, M: Multinomial-Dirichlet bound, T: True value. The rows show the results from three separate simulations. Columns 2–6 show the frequency in which one bound is larger than the true value or another bound. Columns 7 and 8 show the relative size of the other bounds compared to the ZIP bound. Please cite this article as: Higgins, H. N., & Nandram, B., Monetary unit sampling: Improving estimation of the total audit error, Advances in Accounting, incorporating Advances in International Accounting (2009), doi:10.1016/j.adiac.2009.06.001 411 412 413 414 Q3 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 ARTICLE IN PRESS H.N. Higgins, B. Nandram / Advances in Accounting, incorporating Advances in International Accounting xxx (2009) xxx–xxx 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 An estimate of the covariance matrix of β0̂ and β1̂ is the negative 514 inverse Hessian matrix. The Hessian matrix is 515 0 n β + β1 πi bi e 0 B i∑ B =1 B n @ β ∑ bi πi e 0 β0 + β1 πi i=1 + β1 πi i=1 n 2 β0 + β1 πi ∑ bi πi e H −1 = c a2 a1 c where 519 520 n 2 β0 + β1 πi ∑ bi πi e i=1 a1 = n β0 + β1 πi ∑ bi e i=1 n 2 β0 + β1 πi ∑ bi πi e i=1 n 494 495 496 497 498 499 500 501 Q4 502 7. Uncited reference Silverman, 1986 504 Acknowledgement 505 506 507 The authors acknowledge the helpful comments from two anonymous reviewers and the Journal's associate editor. Any errors are the authors' responsibility. 508 Appendix A 509 UN 503 For the centered model, the log-likelihood function is n n n i=1 i=1 i=1 f ðβ0 ; β1 Þ = β0 ∑ di + β1 ∑ πi di − ∑ bi e 510 511 512 513 β0 + β1 πi : We use a two-dimensional Newton's method to maximize it over β0 and β1. Details are omitted. Þ 521 522 β0 + β1 πi n β0 + β1 πi 2 ð ∑ bi πi e β0 + β1 πi Þ − ∑ bi e i=1 ∑ bi e n 2 β0 + β1 πi ∑ bi πi e i=1 523 524 β0 + β1 πi i=1 a2 = n β0 + β1 πi ∑ bi e ED 492 493 β0 + β1 πi 2 i=1 ∑ bi πi e n i=1 n 2 β0 + β1 πi ∑ bi πi e i=1 n β0 + β1 πi 2 −ð ∑ bi πi e : Þ i=1 525 526 527 First, we take β0 + r0i β1 + r0i CT 490 491 RE 488 489 We propose a method that improves upon the common MUS estimation approach. MUS estimation as commonly practiced by accounting auditors does not explicitly recognize that a total population of account errors typically consists of multiple distinct distributions. As a result, this common approach often leads to very large error estimations and very conservative auditor's decisions on the fairness of client financial statement. For conservatism auditors should want large estimations of errors and upper bounds, however estimations under common practice are too conservative, and excessive conservatism has its own problems. Our method, based on Zero-Inflation Poisson distribution, addresses the above shortcoming. We discuss our method and show that for data similar to accounting populations, our bound is reliable and more efficient than common MUS practice, so we would recommend our bound to accounting auditors. For other populations, our method may be slightly less reliable than theoretically desired, and future research should seek to improve the method for these populations. n −ð ∑ bi πi e i=1 c= 6. Conclusion CO R 486 487 C C C A whose elements are all positive values. And the negative inverse 516 517 Hessian matrix is 518 n 485 1 i=1 i=1 484 n ∑ bi πi e OO F 461 462 Neter and Loebbecke (1975). This population consists of 4033 accounts receivables of a large manufacturer. Neter and Loebbecke (1975) have made major contributions in analyzing the error characteristics of accounting populations, and their populations are often used to represent accounting populations (Chan, 1988; Smieliauskas, 1986a; Frost & Tamura, 1982). From Column 2, the ZIP bound is reliable as its 95% upper bound exceeds the true error value 98.4% of the times. From Columns 5 and 6, the frequency in which the ZIP bound is larger than the other two bounds is very small (0.9% and 4.4%, respectively), denoting ZIP's greater efficiency. Columns 7–8 show the magnitude of the Stringer bound is about 34% larger and the Multinomial-Dirichlet bound about 18% larger than the ZIP bound, also denoting ZIP's greater efficiency. Summing the simulation studies, results based on a general population without distinct distributions reveal our ZIP method to be very efficient, although with slightly lower reliability than the theoretical level. However, based on populations similar to or taken from accounting error populations, the ZIP bound is reliable and more efficient than the Stringer and Multinomial-Dirichlet bounds. The ZIP overall performance seems very good, in light of Swinamer et al. (2004), who use extensive simulations on both real and simulated data to compare 14 bounds, and find no single method to be superior in terms of both reliability and efficiency. Given the trade-off between reliability and efficiency in the state-of-the-art, our ZIP method is very promising as an MUS estimation technique. PR 459 460 7 β̃0 e Normalf β̃ 1 ind ! a1 ; c c g a2 where a1, a2, and c are the elements in the previous matrix H− 1. This is a standard approximation and is typically used in generalized regression model. Next, we show how to construct the confidence interval. We can see that ln λi = (β0 + r0i) + (β1 + r1i) πi also follows the normal distribution and we can obtain its mean and variance easily, 528 529 530 531 532 533 534 2 ln λi iid e Normalðβ̃0 + πi β̃1 ; 1 = a1 + πi = a2 + 2πi = cÞ: As we discussed earlier, we know that di follows the Poisson β0 + β1 πi distribution di j λi ind Þ. This is equivalent to di jλi ind e Poissonðbi e e lnλi Poissonðe bi Þ. We can then draw Z1, Z2, …, ZN, N independent standard normal random variables, and compute lnðλi Þ = ðβ̃0 + πi β̃1 Þ + Zi 535 536 537 538 539 540 541 qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 1 = a1 + π2i = a2 + 2πi = c where i = 1, 2, …, N. Thus, we can now draw di as a Poisson random variable with parameter λ̃i, i = 1, 2, …, N. So these random draws of λ̃1, λ̃2, …, λ̃N are given by λ̃i = expfðβ̃0 + πi β̃1 Þ + Zi 542 543 544 545 546 qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 1 = a1 + π2i = a2 + 2πi = cg where i = 1, 2, …, N. We repeat this process 10,000 times and obtain 547 548 N 10,000 values of the overstatement error ∑ di. i=1 Please cite this article as: Higgins, H. N., & Nandram, B., Monetary unit sampling: Improving estimation of the total audit error, Advances in Accounting, incorporating Advances in International Accounting (2009), doi:10.1016/j.adiac.2009.06.001 549 ARTICLE IN PRESS 8 H.N. Higgins, B. Nandram / Advances in Accounting, incorporating Advances in International Accounting xxx (2009) xxx–xxx 550 Appendix B 551 A Zero-Inflated Poisson (ZIP) regression model Now the joint probability density function becomes n β −b e 0 Lðθ; β0 ; β1 Þ = ∏ θ + ð1−θÞe i + β1 πi D 556 557 559 558 561 560 562 563 564 565 566 567 + β1 πi di −bi e D Þ e di ! β0 + β1 πi ) – where D is the set of accounts with where di = 0; D is the set of accounts with where di ≠ 0. To apply the Expectation Maximization (EM) algorithm, we introduce a new variable Zi, which is defined as pðdi = 0jZi = 1Þ = 1 N pðdi ≠0 jZi = 1Þ = 0: ∑di . i=1 We can obtain the Hessian matrix of θ ̂, β̂0,1 to maximize the likelihood function; it is more convenient to use the EM algorithm. By EM algorithm, optimizing the joint probability density function above is equivalent to optimizing the following quantity: ˜ where β0 + β1 πi + β1 πi Þ−bi e and the MLE of β0 and β1 can be obtained by Newton's method in Appendix A. Since θ only exists in the first two terms, we separate this product into two parts, the first two factors and the last two factors. We differentiate the log-likelihood function of g(θ, Z ), set it equal to zero ˜ and solve for θ. Thus we have ∑ Zi n−∑ Zi gðθ; Z Þ = θ D ð1−θÞ D ˜ D ∂ lnðgðθ; Z ÞÞ = ∂θ ˜ D ∑ Zi D θ − ðn−∑ Zi Þ D =0 1−θ UN Therefore, CO lnðgðθ; Z ÞÞ = ∑ Zi lnθ + ðn−∑ Zi Þ lnð1−θÞ ˜ θ = ∑ Zi = n D ! θ β0 + β1 πi θ + ð1−θÞe−bi e Zi jðθ; β0 ; β1 Þ ind e Bernoulli 582 581 i= 1; 2; …; n ðB:1Þ And the conditional expectation of Zi, EfZi jθ; β0 ; β1 g = θ β 0 + β 1 πi θ + ð1−θÞe−bi e : ðB:2Þ 584 583 585 586 587 n i=1 N ∑ i=n + 1 di to 600 First iteration, we use θ = 0.5 as a random starting value. For β0, β1, we use the MLE estimates of the regression coefficients as the starting values, i.e., β̃0 = −5.71, β1̃ = − 0.22. 603 References 604 Akresh, A. D., & Tatum, K. W. (1988). Audit sampling — Dealing with the problems. The saga of SAS No. 39 and how firms are handling its requirements. Journal of Accountancy, 166(6), 58−64. American Institute of Certified Public Accountants (AICPA). (1981). Statement on Auditing Standards No. 39, Audit Sampling. New York: AICPA. American Institute of Certified Public Accountants (AICPA). (2006). Statement on Auditing Standarrds No. 111, Amendment to Statement on Auditing Standards No. 39, Audit Sampling. New York: AICPA. American Institute of Certified Public Accountants (AICPA). (2008). Audit Guide: Audit Sampling. New York: AICPA. Annulli, T. J., Mulrow, J., & Anziano, C. R. (2000). The Wisconsin audit sampling study. Corporate Business Taxation Monthly (pp. 19−30). Bickel, P. J. (1992). Inference and auditing: The Stringer bound. International Statistical Review, 60(2), 197−209. Burgstahler, D., & Jiambalvo, J. (1986). Sample error characteristics and projection of error to audit populations. Accounting Review, 61(2), 233−248. Chan, K. H. (1988). Estimating accounting errors in audit sampling: Extensions and empirical tests of a decomposition approach. Journal of Accounting, Auditing & Finance, 11(2), 153−161. Dworin, L., & Grimlund, R. A. (1986). Dollar-unit sampling: A comparison of the quasiBayesian and moment bounds. Accounting Review, 61(1), 36−58. Elder, R. J., & Allen, R. D. (1998). An empirical investigation of the auditor's decision to project errors. Auditing: A Journal of Practice & Theory, 17(2), 71−87. Felix, W. L., Jr., Leslie, D. A., & Neter, J. (1981). University of Georgia Center for Audit Research Monetary-Unit Sampling Conference, March 24, 1981. Auditing: A Journal of Practice & Theory, 1(2), 92−103. Frost, P., & Tamura, H. (1982). Jacknifed ratio estimation in statistical auditing. Journal of Accounting Research, 20(1), 103−120. Gafford, W. W., & Carmichael, D. R. (1984). Materiality, audit risk and sampling: A nutsand-bolts approach (part two). Journal of Accountancy, 158(4–5), 109−118. Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (2004). Bayesian data analysis, 2nd edition, New York: Chapman & Hall. Grimlund, R. A., & Felix, D. W., Jr. (1987). Simulation evidence and analysis of alternative methods of evaluating dollar-unit samples. Accounting Review, 62(3), 455−480. Guy, D. M., & Carmichael, D. R. (1986). Audit sampling: An introduction to statistical sampling in auditing, 2nd edition, New York: John Wiley & Sons Inc. Hall, T. W., Hunton, J. E., & Pierce, B. J. (2000). The use of and selection biases associated with nonstatistical sampling in auditing. Behavioral Research in Accounting, 12, 231−255. Hall, T. W., Hunton, J. E., & Pierce, B. J. (2002). Sampling practices of auditors in public accounting, industry, and government. Accounting Horizons, 16(2), 125−136. Hansen, S. C. (1993). Strategic sampling, physical units sampling, and dollar units sampling. Accounting Review, 68(2), 323−345. Hitzig, N. (1995). Audit sampling: A survey of current practice. CPA Journal, 65(7), 54−58. Kaplan, R. S. (1973). Statistical sampling in auditing with auxiliary information estimators. Journal of Accounting Research, 11(2), 238−258. Kaplan, R. S. (1975). Sample size computations for dollar-unit sampling. Journal of Accounting Research, 13(3), 126−133. Lambert, D. (1992). Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics, 34(1), 1−14. Leitch, R. A., Neter, J., Plante, R., & Sinha, P. (1982). Modified multinomial bounds for larger number of errors in audits. Accounting Review, 57(2), 384−400. Lohr, S. L. (1999). Sampling: Design and analysis. New York: Duxbury Press. Menzefricke, U. (1984). Using decision theory for planning audit sample size with dollar unit sampling. Journal of Accounting Research, 22(2), 570−587. 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 RR EC ˜ N i=1 TE β0 + β1 πi ∑½d ðβ −∑ð1−Zi Þbi e i 0 D eD hðβ0 ; β1 ; Z Þ = e 580 579 595 596 obtain D for the predicted or the projected estimator. We repeat this 601 process 10,000 times and obtain 10,000 values of the overstatement error 602 ˜ 578 80 1 9 1 > > θ + r2i < β̃0 = ind −1 C @ β0 + r0i A e Normal B : @ β̃1 A; −H > > : ; β1 + r1i θ̃ di from Poisson(biλi). Then we calculate ∑ di and ∑ di + 571 570 576 577 594 Thus, we have ∑ Zi n−∑ Zi gðθ; Z Þ = θ D ð1−θÞ D 574 575 592 593 Zi = 1; if di is pure zero; Zi = 0; else ˜ 573 572 590 591 We draw a random sample of (β0 +r0i, β1 +r1i, θ+r2i) from the 597 normal density i=1,2,⋯, N. Then we construct λi as λi =eβ0 + r01 + (β1 +r1i)πi 598 and we have Zi ~Bernoulli (θ+r2i). Now, if Zi =1, di =0; if Zi =0, we draw 599 gðθ; Z Þhðβ0 ; β1 ; Z Þ 569 568 588 589 0 DP 553 554 555 ( o ðb eβ0 ∏ ð1−θÞ i RO OF 552 After we get all the E{Zi|θ,β0,β1} for the sample using Eq. (B.2), we can optimize h(β0, β1, Zi) and calculate a new θ following Eq. (B.1). Then we can start our second iteration. The MLE of β0, β1 and the new θ obtained will be used to calculate new Zis. After that, we optimize h(β0, β1, Zi) and calculate a new θ again. These values will be used in the third iteration. We repeat this process until θ, β0, β1 all converge. Letting H be the Hessian matrix, too cumbersome to present, we take Please cite this article as: Higgins, H. N., & Nandram, B., Monetary unit sampling: Improving estimation of the total audit error, Advances in Accounting, incorporating Advances in International Accounting (2009), doi:10.1016/j.adiac.2009.06.001 ARTICLE IN PRESS H.N. Higgins, B. Nandram / Advances in Accounting, incorporating Advances in International Accounting xxx (2009) xxx–xxx 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 Messier, W. S., Jr., Kachelmeier, S. J., & Jensen, K. L. (2001). An experimental assessment of recent professional development in nonstatistical audit sampling guidance. Auditing: A Journal of Practice & Theory, 20(1), 81−96. Nandram, B., Sedransk, J., & Pickle, L. (2000). Bayesian analysis and mapping of mortality rates for chronic obstructive pulmonary disease. Journal of the American Statistical Association, 95(452), 1110−1118. Nelson, M. K. (1995). Strategies of auditors: Evaluation of sample results. Auditing: A Journal of Practice & Theory, 14(1), 34−49. Neter, J., & Loebbecke, J. K. (1975). Behavior of major statistical estimators in sampling accounting populations — An empirical study. New York: AICPA. Neter, J., Leitch, R. A., & Fienberg, S. E. (1978). Dollar unit sampling: Multinomial bounds for total overstatement and understatement errors. Accounting Review, 53(1), 77−94. Reneau, J. H. (1978). CAV bounds in dollar unit sampling: Some simulation results. Accounting Review, 53(3), 669−680. Schwartz, D. A. (1997). Audit sampling — A practical approach. CPA Journal, 67(2), 56−60. 9 Silverman, B. W. (1986). Density estimation for statistics and data analysis. New York: Chapman and Hall. Smieliauskas, W. (1986). A simulation analysis of the power characteristics of some popular estimators under different risk and materiality levels. Journal of Accounting Research, 24(1), 217−230. Smieliauskas, W. (1986). Control of sampling risks in auditing. Contemporary Accounting Research, 3(1), 102−124. Stringer, K. W. (1963). Practical aspects of statistical sampling in auditing. Proceedings of the Business and Economic Statistics Section, American Statistical Association (pp. 404−411). Swinamer, K., Lesperance, M. L., & Will, H. (2004). Optimal bounds used in dollar unit sampling: A comparison of reliability and efficiency. Communications in Statistics — Simulation and Computation, 33(1), 109−143. Tsui, K. W., Matsumura, E. M., & Tsui, K. L. (1985). Multinomial-Dirichlet bounds for dollar-unit sampling in auditing. Accounting Review, 60(1), 76−97. Yancey, W. (2002). Statistical sampling in sales and use tax audits. Chicago: CCH Incorporated. UN CO R RE CT ED PR OO F 689 Please cite this article as: Higgins, H. N., & Nandram, B., Monetary unit sampling: Improving estimation of the total audit error, Advances in Accounting, incorporating Advances in International Accounting (2009), doi:10.1016/j.adiac.2009.06.001 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688