Professional Skepticism and Auditors' Workpaper Review

Kathy Hurtt
University of Wisconsin

Martha Eining and David Plumlee
University of Utah

Draft version February, 2002

Please do not quote without permission of the authors. Comments are welcome.

We thank the workshop participants at Arizona State University, University of Utah, University of Wisconsin and Bentley College for their comments on earlier versions of this paper.

Review of subordinates' workpapers by more experienced auditors consumes a significant portion of the effort on an audit engagement (Bamber and Bylinski 1987). Research into the behavior of workpaper reviewers shows that experience affects their ability to detect certain kinds of errors (e.g., Ramsay 1994, Bamber and Ramsay 1997, 2000, Harding and Trotman 1999). Audit risk and information importance are situational factors positively associated with the accuracy of an auditor's memory for evidence contained in workpapers (Sprinkle and Tubbs 1998). Beyond experience and the situational factors investigated by Sprinkle and Tubbs (1998), our understanding of the workpaper review process remains remarkably incomplete. One unexamined potential influence on the review process can be found in the audit standards, which state that, "(s)ince evidence is gathered and evaluated throughout the audit, professional skepticism should be exercised throughout the audit process" (AU § 230.7-9). Thus, the due care requirement that auditors exercise professional skepticism is a critical mandate that should extend to auditors' workpaper review. This paper examines the relationship between the degree of skepticism possessed by an auditor and four specific behaviors exhibited during workpaper review.

The audit profession's institutionalization of workpaper reviewing arises from the requirement that assistants be adequately supervised (AU § 150.02).
However, from a practical perspective, the review process is time consuming and expensive (Bamber and Bylinski 1987, Asare and McDaniel 1996), and ways of producing efficiency gains in workpaper reviewing are likely to be welcomed by the auditing profession. Early studies of the review process (Trotman and Yetton 1985, Trotman 1985) find no performance difference between interacting groups composed of two senior auditors and groups composed of a senior and a manager. Ramsay (1994) finds that seniors detect mechanical errors better than managers do, while the reverse is true for conceptual errors. A similar relationship exists between staff and senior auditors, where staff auditors are better at detecting mechanical errors (Harding and Trotman 1999). Other studies show that combining the reviews of managers and seniors results in more effective reviews than those of either seniors or managers alone (Bamber and Ramsay 1997), and that composite teams consisting of a senior and a staff auditor outperform pairs of either staff or senior auditors (Harding and Trotman 1999). While this research generally focuses on the efficiency and effectiveness gains that could result from specialization within the review process, other significant influences remain unexamined. This paper addresses one such influence: the reviewer's professional skepticism.

In addition to the due care requirement for auditors to be skeptical, Statement on Auditing Standards (SAS) No. 53 (AICPA 1988c), No. 57 (AICPA 1988b), No. 67 (AICPA 1992), and No. 82 (AICPA 1997a) reinforce the necessity for auditors to exhibit professional skepticism in a variety of situations and on different types of engagements. However, there is anecdotal evidence that auditors either do not consistently behave skeptically or vary in the degree of skepticism they possess.
Former chief accountants of the Securities and Exchange Commission's (SEC) enforcement division have indicated that a lack of independence and a lack of professional skepticism were among the primary causes of SEC actions against accountants (The CPA Journal 1996). Research on SEC Enforcement Actions (1987-1997) by Beasley, Carcello and Hermanson concurred with this assessment, indicating that 60% of the enforcement actions were related to a lack of professional skepticism (AICPA 2000). In response to the SEC's concern about the quality of financial audits, the Public Oversight Board established a panel that recommended that audit firms provide guidance to their audit personnel about the concept of professional skepticism (POB 2000). Thus, understanding the role of professional skepticism in workpaper review may provide important insight into a recognized problem within the audit profession.

A critical step in conducting research involving professional skepticism is finding a means of identifying individuals who can be characterized as skeptical. Recent research by Hurtt (2001) has resulted in a 30-item psychological scale that measures the degree of skepticism possessed by an individual. In addition, a model linking an auditor's degree of skepticism and certain behaviors has been proposed (Hurtt, Eining and Plumlee 2001). The model of skepticism, developed from surveys of professional accountants as well as literatures from philosophy and psychology, predicts that four behaviors will be associated positively with increasing levels of skepticism: (1) expanded information search, (2) increased identification of contradictions, (3) increased generation of alternatives and (4) increased examination of information from and about people. Our study tests the assertion that auditors with high levels of skepticism will exhibit these behaviors when reviewing workpapers.
The AICPA accepts the position that professional skepticism results in behavioral differences. SAS 53 (AICPA 1988c), one of the pronouncements intended to reduce the "expectation gap" between the general public and public accountants, re-emphasized the necessity for an auditor to have an attitude of professional skepticism and described actions that a skeptical auditor would take. An even stronger position on professional skepticism, and one with more behavioral implications, is found in SAS 82, which states that, "Due professional care requires the auditor to exercise professional skepticism - that is, an attitude that includes a questioning mind and a critical assessment of audit evidence" (emphasis added). SAS 82 also modifies SAS 1 to more clearly define and explain professional skepticism, and lists examples of skeptical behavior, including sensitivity in the selection and extent of documentation and increased corroboration of management representations (AICPA 1992b).

We test whether more skeptical auditors exhibit these four behaviors when reviewing workpapers, as well as the impact of higher audit risk on less skeptical auditors, in an on-line workpaper review task administered to auditors from a major international accounting firm. The experiment involved two phases. In the first phase, participants accessed a web site and completed the Skepticism Scale and demographic questions. The second phase consisted of a workpaper review task, also performed on-line, in which the reviewer was asked to create review notes based on the final review of a subset (debt and inventory) of audit workpapers. Embedded in the workpapers were two risk treatment conditions (high and normal audit risk indicators). Using the Skepticism Scale score to differentiate between those who were more and less skeptical, we find that more skeptical auditors detect more contradictions and demonstrate a stable search pattern across risk situations.
In contrast, a high risk audit situation induced less skeptical auditors to generate more information search queries, spend more time evaluating the information and look at more information from and about people than they did in the low risk situation. However, this additional work by less skeptical auditors does not result in more contradictions being detected. This suggests that although less skeptical auditors recognize the risk present in a high risk situation, they are not effective in translating that awareness of risk into better detection of contradictions.

The remainder of this paper is organized as follows. The next section reviews prior research into both workpaper review and professional skepticism along with the behavioral predictions. The fourth section describes the experimental methods and data collections. The data analysis is in the fifth section with the conclusions and implications in the last section.

II. BACKGROUND LITERATURE AND HYPOTHESES

Research on the workpaper review process

Bamber and Bylinski (1987) document the substantial portion of total audit hours that audit firms devote to the review process. Certain situational factors have been shown to affect the review process. Asare and McDaniel (1996) show that when reviewers are familiar with preparers they reperform less of the original work. They also find that reviewers who are more familiar with preparers detect more conclusion errors in a complex task relative to a routine task, while reviewers who are not familiar with preparers are more effective in the routine task. Other experimental studies of the review process compare judgments between sequential reviewing and interacting groups, and show that, whether performance is measured by consensus or accuracy, the sequential review process does not outperform the interacting group (Trotman and Yetton 1985, Trotman 1985).
Research involving actual review of simulated workpapers explicitly recognizes that the nature of the review task differs between managers and seniors. Ramsay (1994) finds that managers were better at detecting conceptual errors, while seniors were more accurate at detecting mechanical errors. Harding and Trotman (1999) generalize Ramsay's (1994) finding by showing that composite groups with a staff and a senior auditor outperform groups with either two seniors or two staff auditors. Extending this research to the question of whether specialization improves reviewer performance, Bamber and Ramsay (1997) compared "all-encompassing" reviews with specialized reviews that focused exclusively on either conceptual or mechanical errors. Consistent with Ramsay (1994), they find that managers were more effective at identifying conceptual errors, while seniors were more effective at identifying mechanical errors. However, when ex post teams composed of both a senior and a manager engage in an all-encompassing review, they are more effective than teams performing specialized reviews. Examining review efficiency and reviewer confidence, Bamber and Ramsay (2000) find that specialized reviews are less efficient in terms of time, and that seniors, but not managers, were more confident when performing specialized reviews.

In studies involving reviewers' memory for audit evidence, Moeckel (1990) finds that inexperienced auditors fail to connect related pieces of audit evidence more often than experienced auditors do, whereas experienced auditors were more likely to mentally reconstruct inconsistent pieces of evidence to form a coherent memory of the evidence.
Moeckel and Plumlee (1989) correlate confidence in memory with recognition accuracy and find that, when the mental representations resulted from inference, recognition accuracy and auditors' confidence in their memories were inversely related, implying that inferred (i.e., internally generated) memories were seen as valid audit evidence. Looking for situational factors that might mitigate misplaced confidence, Sprinkle and Tubbs (1998) show that audit risk and information importance do so. Libby and Trotman (1993) demonstrate that the review process can serve as a countervailing force to the potentially biased recollections of subordinates.

Bamber, Bamber and Bylinski (1988) surveyed audit managers using an instrument that focused on four audit areas. They created a taxonomy of eight workpaper review activities. For each activity, participants indicated whether they would engage in that activity and estimated their anticipated time spent reviewing. Bamber et al. find that managers engage in directed search, that is, pursuing related audit evidence rather than analyzing information sequentially. Bamber et al. describe the review process as a stable search process in which auditors are sensitive to account materiality and risk but do not significantly change their review strategies in response. They also find that over three-quarters of their subjects read all working papers for each account and perform active inquiry such as thinking of competing explanations and performing analytical tests.

In summary, research into the workpaper review process finds that it consumes a large portion of a firm's total audit hours, and that situational factors affect performance. Auditors' level in the firm corresponds to how they mentally view the review task, which, in turn, affects their performance in error detection. Auditors' search through working papers is directed and, typically, thorough.
Research into audit workpaper reviewing has not examined the impact of professionally mandated factors such as skepticism.

Research on skepticism

Much of the skepticism research in accounting (McMillan and White 1993, Shaub and Lawrence 1996, Shaub 1996, Choo and Tan 1998) attempts to measure an individual's skepticism level and then examine behavioral differences among individuals with different levels of skepticism. Most of these studies build upon the premise that suspicion is the opposite of trust, and skepticism is synonymous with suspicion. However, it is difficult to find either theoretical or anecdotal support for equating these constructs. Suspicion and skepticism are never equated in the philosophical literature on skepticism; rather, skepticism is portrayed as a complex construct rather than a unidimensional one. Also, Hurtt (2001) indicates that practicing accountants did not equate skepticism with suspicion.

Several other studies employ a manipulation intended to induce skepticism in a treatment group, then examine the behavioral differences between groups (Peecher 1996, Turner 1997, Shaub and Lawrence 1997). Both Peecher (1996) and Turner (1997) find behavioral differences between the auditors induced to behave skeptically and those in a control condition. Turner finds that auditors in the skepticism-induced condition examined more total items and examined more disconfirmatory items (items that would tend to disprove or discount the client's position) than did the other auditors. Peecher finds that auditors in a skepticism-induced situation were less likely to rely on client-provided explanations. Shaub and Lawrence (1997) manipulate situational variables in eight different scenarios so that part of the information in the scenario either signaled or did not signal a risk of fraud.
They ask participants to predict what they would do in each scenario, and find that new hires generally exhibit greater skepticism (defined as increased audit testwork and increased questioning of management and other personnel) than do seniors, managers and partners. These studies demonstrate that auditors behave differently when induced to do so. It appears that auditors respond to either situational cues indicating increased risk of fraud or to explicit instructions to be more skeptical.

In contrast to the situational view of previous studies, Hurtt (2001) adopts the view that skepticism is a psychological characteristic of individuals. She develops an understanding of philosophical skepticism as a basis for her multidimensional definition of skepticism, from which she develops an instrument that can be used to measure individuals' level of skepticism. She identifies six subconstructs of skepticism from questioning professional accountants and from the philosophical literature on skepticism: a quest for knowledge (curiosity), a desire to understand people, a questioning nature, a slowness in forming judgments, a reluctance to accept others' claims, and self-confidence. These constructs were used to develop a 30-item scale which demonstrated strong test-retest reliability, indicating that skepticism is a relatively stable individual characteristic. This skepticism scale provides a useful measure of an individual's skepticism level.

As Hurtt et al. (2001) explain, research into the writings of early philosophers (Annas and Barnes 1985, Bunge 1991, Kurtz 1992, McGinn 1989, Popkin 1979) suggests that skepticism results in four behaviors: increased information search, increased detection of contradictions, increased generation of alternative explanations, and expanded scrutiny of information from and about people.
Citing Kurtz (1992), they explain that skeptics have a 'questioning nature' characterized by searching and examining, and that skeptics are people who persist in their investigations (Annas and Barnes 1985). Hurtt et al. also argue that skeptics are motivated to answer questions that are raised through an expanded information search and are comfortable with uncertainty because of their willingness to suspend judgment. Thus, skeptics exhibit more extensive information searches than less skeptical individuals.

The second behavior of skeptics asserted by Hurtt et al. is that skeptics are more adept at discovering contradictions. They attribute this ability to skeptics' willingness to suspend judgment, which makes it cognitively comfortable for them to delay reaching conclusions and to search more information.

The third behavior Hurtt et al. contend skeptics exhibit is constructing alternative hypotheses regarding statements or claims (McGinn 1989). Support for this comes from the following: "Skeptics wish to examine all sides of a question; and for every argument in favor of a thesis, they usually can find one or more arguments opposed to it" (Kurtz 1992, p 22). So, skeptical individuals can be expected to construct alternate interpretations for the information they observe.

The fourth behavior Hurtt et al. attribute to skeptics is an interest in information about people. They claim that skeptics want to understand underlying assumptions behind claims and beliefs and that skeptics try to understand people in order to understand the assumptions that were made (Popkin 1979). This reluctance 'to accept others' claims' results in an expanded investigation of information provided by people, since by understanding a person skeptics can understand the perceptions and assumptions made by that person.

These philosophical behavioral expectations are also consistent with the audit procedures required by auditing standards.
The expanded information search expected of skeptics is consistent with the general standard on fieldwork (SAS 1) that requires an auditor to obtain sufficient competent evidential matter before opining on the financial statements. This mandates that an auditor perform testwork until she or he believes that sufficient evidential matter exists. In addition, both contradiction detection and alternative generation are expected of auditors. For example, SAS 56 (AICPA 1988a) requires auditors to develop expectations about a company's financial statements prior to performing any analytical review procedures. As part of the procedure, auditors are expected to identify and explain (generate alternative explanations for) any unexpected differences. It is only after differences have been identified and alternatives have been generated by the auditor that any inquiry as to management's explanations should be made. Evaluating the explanations provided by management again allows an auditor to detect contradictions in explanations or among analytic review results. In addition, SAS 56 (AICPA 1988a) indicates that an auditor must obtain corroborating evidence for management's assertions. SAS 82 requires that auditors specifically assess management's characteristics by obtaining information about management's abilities, pressures, style and attitude toward internal control. This is consistent with the suggestion from philosophical skepticism that a skeptic will exercise increased scrutiny of information both from and about people.

Hypothesis Development

Hypothesized effect of skepticism

While Peecher (1996) and Turner (1997) find that participants instructed to exercise professional skepticism exhibit skeptical behaviors, we believe that skepticism is also a personality trait. That is, skeptical individuals will exhibit skeptical behaviors across situations.
Specifically, we expect that skeptical auditors will exhibit the four behaviors found in the model of skeptical behavior developed by Hurtt et al. (2001): greater information search, better detection of contradictions, greater generation of alternative explanations, and expanded scrutiny of information from and about people. This leads to the following research hypothesis:

H1: During workpaper review, auditors with higher levels of skepticism will display greater levels of information search, contradiction detection, alternative generation, and scrutiny of information from and about people than those with lower levels of skepticism.

Hypothesized effect of risk on individuals with lower skepticism levels

SAS 82 instructs auditors to be aware of situational cues that indicate a need for a heightened sense of professional skepticism. Shaub and Lawrence (1997) find that auditors responded to situational risk cues embedded in several experimental scenarios. GAAS specifies the need for skeptical behavior in high risk situations, and research indicates that auditors do respond to situational cues. What is less certain is whether more and less skeptical auditors will behave differently in a "normal" audit situation. Johnson (1978) indicates that although all people are "skeptical" about certain things or in certain situations, not all people are skeptics, nor are skeptics skeptical in all situations. We assume that skeptical auditors maintain a higher level of skeptical behavior at all times, and that less skeptical auditors will exhibit skeptical behavior in higher audit risk situations but not when the audit situation is not higher risk.
This leads to the following hypothesis:

H2: During workpaper review, less skeptical auditors in the low risk condition will exhibit lower levels of information search, contradiction detection, alternative generation, and scrutiny of information from and about people than either more skeptical auditors in either condition or less skeptical auditors in the high risk condition.

Experimental Methods

Experimental design

This research employs a randomized block experimental design where participants were categorized as scoring higher or lower on the Skepticism Scale, then assigned to one of two risk treatment groups. The risk treatments result from manipulation of situational cues that indicate either a "normal" or a "high risk" audit condition. The dependent variables are information examined and time spent (to measure information search behavior and scrutiny of information from and about people), contradictions detected (to measure contradiction detection), and alternatives generated (to measure alternative generation).

Experimental Task

The experiment involved two distinct phases. In the first phase, participants completed the Skepticism Scale and answered demographic questions. The scale and demographic questions were coded into HTML and JavaScript and placed on a web-server, where participants answered by accessing a URL for the site. The participants' responses were collected online using a CGI Script. After scoring each participant's responses to the Skepticism Scale, we assigned the participants alternately to either the high or low risk experimental condition based on their ranked skepticism score to ensure balanced cell sizes between the experimental conditions.

The second phase of the experiment was administered an average of 13 days after completion of the pretest, and consisted of a workpaper review task with two treatment conditions (high and normal audit risk indicators) embedded in the workpapers.
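The alternating assignment by ranked skepticism score described above can be illustrated with a minimal sketch. The participant identifiers, the example scores, the condition labels and the choice to rank from highest to lowest are all assumptions made for illustration; the paper does not report how the procedure was implemented.

```python
def assign_conditions(scores, conditions=("high_risk", "normal_risk")):
    """Alternate risk conditions down the ranking of Skepticism Scale scores.

    `scores` maps a participant id to a total scale score. The ids, the
    condition labels and the highest-first ordering are illustrative
    assumptions, not details taken from the study.
    """
    # Rank participants by score, highest first (sorted() is stable for ties).
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Walk down the ranking, alternating between the two conditions.
    return {pid: conditions[i % len(conditions)] for i, pid in enumerate(ranked)}


# Hypothetical scores for four participants.
example_scores = {"p1": 140, "p2": 120, "p3": 135, "p4": 110}
assignment = assign_conditions(example_scores)
```

Because adjacent ranks go to different conditions, each condition receives a similar spread of scores and the cell sizes differ by at most one participant.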
Similar to Shaub and Lawrence (1997), one set of workpapers contains several cues that are consistent with a heightened audit risk level and the other contains no specific high risk cues. Appendix B lists the differences between the instructions in the two treatment conditions. The participants conducted general information search through "Workpaper" screens. The auditors were expected to sign off on these screens (these were 'normal' audit workpaper debt and inventory screens) and had the ability to write review notes for them. In contrast, information from and about people was contained within "People" screens. The "People" screens contained brief biographies of the employees, and there was no ability to sign off or write review notes on these screens.

The experimental task was based on the one used by Moeckel (1990). The written task instructions inform each participant that he or she will be performing the only review of a subset (debt and inventory) of the workpapers, and ask the reviewer to document any review notes and the reasons for making each note. For this study, the original workpapers were modified slightly, coded into HTML and JavaScript and placed on a web-server. The modifications included: 1) expanding the workpapers to include "people" information consisting of an organizational chart and brief biographies of company employees; 2) a "review notes" textbox frame located on the top portion of every screen, replacing the review note tablet in the M&P study; 3) tickmarks modified to ones supported by a keyboard; and 4) the addition of separate screens for the "people information." Dates were also changed to make the workpapers current. The original workpapers were designed to contain a number of contradictions and errors and these remained in the online version. Appendix A contains an example of both a contradiction and an error that were in the workpapers.
As a result of pilot testing, several changes were made to the workpapers: 1) inclusion of a screen indicating specific inventory testwork that was performed, which replaced physical copies of eight invoices that were included in the original workpapers; 2) modification of the instructions from one long screen to three short screens, which are easier to read online; and 3) division of the permanent file screen from one long screen into three shorter screens for the same reason. A second pilot test indicated that these changes were effective and gave preliminary indication that participants in the treatment conditions were responding as predicted.

Participants completed the task by accessing the online site, examining the workpapers and writing review notes, which were captured using a CGI Script. An analysis of the participants' workpaper review notes provides evidence regarding detection of contradictions, detection of errors, and information search queries. In addition, unknown to the participants, the program captured the screens examined and the amount of time spent on each screen. This information allowed analysis of the information search behavior and the scrutiny of information from and about people. To examine the participants' generation of alternatives, three brief scenarios were presented to the participants at the conclusion of the workpaper review task. Each participant was asked to generate written alternative explanations for each scenario (Table 8), and an analysis of the participants' answers to the scenarios provides evidence regarding alternative generation.

Data Analysis and Results

Prior to beginning statistical analysis, the alternative generation, identification of contradictions and information search variables were coded by two independent raters who were blind to the treatment conditions and skepticism levels of the participants.
Inter-rater reliability was greater than .90 for the coding on contradiction detection, and greater than .85 for alternative generation coding and information search coding. Differences were resolved by the researchers. Both the amount of time spent[1] and the number of screens the auditor examined are used to measure general information search and the scrutiny of information from and about people. To obtain time spent and number of screens information, the raw data captured by the CGI Script was summarized by screens and time. An initial review of the time spent indicated that, although participants were instructed to complete the entire review without interruptions, in some cases it appeared that participants had been interrupted. The data was winsorized and all analyses use this winsorized sample data.[2] Descriptive statistics for all variables are presented in Table 1.

--------------------------------------------------
Insert Table 1 about here
--------------------------------------------------

[1] Participants were not aware that time or search pattern was being captured.

[2] For example, participants spent (on average) approximately two minutes on workpaper APF-3 (a single text page containing information on client personnel). However, one participant (who wrote no review notes while this workpaper was on his or her screen) spent over 115 minutes before exiting this screen. Congruent with the procedure described by Winer (1971) on how to handle influential outliers, the sample was winsorized at g = 1, where the highest and lowest time values were replaced by the second highest and second lowest time values.

Effect of skepticism

Our initial test was a MANOVA … with all

Table 13 presents the results of a series of ANOVA procedures used to test for the effect of skepticism. Consistent with philosophical predictions, skepticism level was a significant (p<.05, n = 75) predictor of the auditor's detection of contradictions.
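The winsorizing of time values at g = 1 mentioned in the data preparation can be sketched as follows. This is only an illustration under stated assumptions: the function name and the example times are hypothetical, and the code follows the textbook definition (replace the g lowest and g highest values with their nearest retained neighbours) rather than the authors' actual procedure.

```python
def winsorize(values, g=1):
    """Return a copy of `values` winsorized at level g.

    The g smallest observations are replaced by the (g+1)-th smallest and
    the g largest by the (g+1)-th largest. With g = 1 the highest and
    lowest values become the second highest and second lowest.
    """
    if g < 0 or 2 * g >= len(values):
        raise ValueError("g must be non-negative and smaller than half the sample size")
    ordered = sorted(values)
    low, high = ordered[g], ordered[-g - 1]
    # Clamp every observation into [low, high], preserving the original order.
    return [min(max(v, low), high) for v in values]


# Hypothetical minutes spent on one screen by five participants,
# including one apparently interrupted session.
example_times = [2.0, 1.5, 115.0, 2.5, 0.2]
winsorized_times = winsorize(example_times, g=1)
```

Unlike trimming, this keeps the sample size unchanged, so every participant still contributes an observation to the later analyses.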
Contradictions are the type of errors that are most critical for an auditor to identify and include elements such as contradictory or inconsistent client explanations. Increased identification of this type of error should result in more effective audits. Skepticism level was not a significant predictor of increased information search or increased scrutiny of information about people. Neither was it significant in the number or type of alternatives generated.3 4 --------------------------------------------------Insert Table 1 about here --------------------------------------------------Planned comparison with low risk and low skepticism level Table 15 presents the results of a series of ANOVA procedures performed prior to completing the planned comparison between the low skepticism, low risk condition group and the remaining three groups. This analysis examines whether differences exist among the groups. Detection of contradictions remains significant as does the scrutiny of information from and about people, which was measured by the number of people screens examined. --------------------------------------------------Insert Table 1 about here 3 Additional analyses were completed using regressions run with the participant's raw skepticism score. This did not result in any meaningful changes in the results. The number of contradictions detected remained significant (t = 2.666, sig t < .009; adj R2 = .076). 4 Four auditors never examined the trial balance or subsequent event workpapers. This is significant because of the nine contradictions present in the workpaper, six of them have part of the contradictory information on either the trial balance or subsequent event workpapers. Participants who did not review these workpapers did not have the information to detect the majority of the contradictions. Three of the four auditors who did not examine these workpapers were in the high skepticism category. We reanalyzed the data after eliminating these four individuals. 
The new analysis resulted in only one significant change: the number of mechanical errors detected differed between the two treatment conditions.

The results of the planned comparison are shown in Table 16. The number of information search queries written in the review notes differs significantly between the two groups, as do the number of people screens examined and the time spent on them. However, neither the detection of contradictions nor the generation of alternatives differs significantly between the groups.

--------------------------------------------------
Insert Table 16 about here
--------------------------------------------------

CONCLUSION

This research examines the important issue of whether professional skepticism, a term widely used in the auditing literature but not clearly defined or well understood, affects four behaviors in workpaper review. The purpose of this study was to test the theoretical predictions of skeptical behaviors, using the instrument to differentiate between more and less skeptical individuals. The following discussion highlights three areas: overall contributions, potential limitations, and future research.

The model predicts that skeptics will exhibit four behaviors: expanded information search, detection of contradictions, generation of alternatives, and scrutiny of information from and about people. The auditors in this study, classified as more or less skeptical by their Skepticism Scale score from an on-line pretest, completed an on-line workpaper review task and then responded to scenarios asking them to generate alternatives.

The most significant finding from the experiment is that more skeptical auditors find more contradictions. More skeptical auditors also demonstrate a stable search pattern across risk situations.
This is consistent with our expectation that more skeptical auditors maintain a level of skeptical behavior in all situations and do not react to changing risk conditions with radical changes in behavior. In contrast, less skeptical auditors, when placed in a high risk audit situation, respond by generating more information search queries, spending more time evaluating the information, and looking at more information from and about people; however, this additional work does not result in more contradictions detected. This suggests that although less skeptical auditors recognize the risk present in a high risk situation, they are not effective in translating that awareness of risk into better results.

The expanded information search behavior that was predicted for more skeptical auditors was not evident in the number of screens examined or the time spent on those screens. Upon reflection, the nature of the experimental task, a workpaper review that requires the participants to "sign off" on each workpaper, created a bias against finding significant differences in search behaviors. Auditors are trained to review and sign each of these workpapers, which limited the discretionary additional information available for expanded information search.

The behavioral model predicts that more skeptical auditors will generate more alternative explanations and a greater number of different types of alternative explanations. Neither of these behaviors was evident from the experiment. Once again, the nature of the experimental task, in which the participants answered brief scenarios after completing a workpaper review, might have biased against finding significant differences. Additionally, the mean time spent on this experiment was over seventy-five minutes, and after that amount of time many participants might have been too exhausted to give complete attention and provide alternatives to these scenarios.
In retrospect, a workpaper review task is not one where a large number of alternatives are generated. On the other hand, auditors performing analytic review procedures are required to generate explanations for variances. Measuring an auditor's skepticism level before he or she performs an analytic review task, and using performance on that task to test the model's prediction of alternative generation, might be a more appropriate approach.

This study also contributes by using a new method of delivering the experimental materials to the participants. The experiment conducted for this study was administered entirely on-line using World Wide Web-based technology. The on-line nature of this experiment allowed auditors to complete the workpaper review task at a time and place of their choosing, helping to recreate a realistic audit environment. This type of experimental design offers one alternative to expensive and time-consuming on-site experiments.

References

Bamber, E. M., and R. J. Ramsay. 1997. An investigation of the effects of specialization in audit workpaper review. Contemporary Accounting Research 14 (Fall): 501-513.

Bamber, E. M., and R. J. Ramsay. 2000. The effects of specialization in audit workpaper review on review efficiency and reviewers' confidence. Auditing: A Journal of Practice and Theory 19 (2): 147-157.

Harding, N., and K. T. Trotman. 1999. Hierarchical differences in audit workpaper review performance. Contemporary Accounting Research 16 (Winter): 671-684.

Libby, R., and K. T. Trotman. 1993. The review process as a control for differential recall of evidence in auditor judgments. Accounting, Organizations and Society 18 (August): 559-574.

Messier, W. F., and R. M. Tubbs. 1994. Recency effects in belief revision: the impact of audit experience and the review process. Auditing: A Journal of Practice and Theory 13 (1): 57-72.

Moeckel, C. L. 1990. The effect of experience on auditors' memory errors. Journal of Accounting Research 28 (Autumn): 368-387.

Ramsay, R. J. 1994. Senior/manager differences in audit workpaper review performance. Journal of Accounting Research 32 (Spring): 127-135.

Sprinkle, G. B., and R. M. Tubbs. 1998. Effects of audit risk and information importance on auditor memory during working paper review. The Accounting Review 73 (October): 475-502.

Trotman, K. T. 1985. The review process and the accuracy of auditor judgments. Journal of Accounting Research 23 (Autumn): 740-752.

____, and P. Yetton. 1985. The effect of the review process on auditor judgments. Journal of Accounting Research 23 (Spring): 256-267.

Table 1
Subject Demographic Statistics (n=83)
(Each cell shows the mean first, followed by the median, with the standard deviation in parentheses.)

                            Risk Condition Low                  Risk Condition High
                            Low Skepticism    High Skepticism   Low Skepticism    High Skepticism
                            (Group 1)         (Group 2)         (Group 3)         (Group 4)
                            n=20              n=19              n=22              n=22
Age                         28.8a             27.8a             26.6a             28.8a
                            28.0              28.0              26.0              28.5
                            (2.9)             (2.0)             (2.4)             (3.1)
Experience in Months        62.6              62.8              48.6              59.7
                            52.0              61.5              47.5              56.0
                            (20.7)            (23.5)            (12.9)            (34.1)
Skepticism Score            128.6**           149.6**           128.2**           147.7**
                            129.0             148.5             129.0             146.0
                            (6.4)             (7.6)             (7.2)             (7.8)
Gender (M = Male;           10 M              13 M              10 M              12 M
F = Female)                 9 F               9 F               12 F              7 F

Gender: 1 no response.
a Significant differences exist between Group 3 and Groups 1 and 4, p < .05 (t-test).
** Significant difference between the high and low skepticism cell means (as planned), p < .001 (t-test).

Table 1
Descriptive Statistics
(Mean values are presented with standard deviation in parentheses.)
                                        Risk Condition Low      Risk Condition High
                                        Low         High        Low         High
                                        Skepticism  Skepticism  Skepticism  Skepticism
Information search measures             n=19        n=22        n=22        n=20
  Workpaper Screens Count               51.9        52.2        53.6        54.6
                                        (21.5)      (24.6)      (25.7)      (27.1)
  Workpaper Screens Time in minutes     76.6        75.6        74.7        79.3
                                        (34.5)      (37.9)      (22.1)      (25.6)
  Information search queries            .47         1.9         1.6         2.1
                                        (1.0)       (3.1)       (2.1)       (3.1)
Contradiction detection measures        n=19        n=18        n=19        n=19
  Contradictions                        1.2         1.6         1.3         2.5
                                        (1.4)       (1.1)       (1.2)       (1.8)
  Errors                                .74         .61         1.1         1.2
                                        (.65)       (.98)       (.85)       (1.07)
Alternative generation measures         n=19        n=18        n=19        n=19
  Total number of alternatives          2.4         3.1         3.4         2.7
                                        (2.5)       (1.5)       (2.3)       (1.6)
  Different types of alternatives       1.8         2.6         2.5         2.3
                                        (1.8)       (1.1)       (1.5)       (1.3)
People search measures                  n=19        n=22        n=22        n=22
  People Screens Count                  .8          2.8         4.8         2.5
                                        (1.1)       (4.1)       (5.7)       (3.4)
  People Screens Time in minutes        .6          2.2         3.2         1.8
                                        (1.4)       (3.5)       (4.5)       (3.2)

Table 3
ANOVA Results: Effect of Skepticism Level

Panel A: Information Search (Hypothesis 1a), n=83
Effect                                          SS          F-Value     Sig. of F
Workpaper screens examined                      5.271       .009        .926
Workpaper time (in minutes)                     63.446      .069        .793
Number of information search queries (n = 74)   17.773      2.916       .092

Panel B: Detection of Contradictions (Hypothesis 1b), n=74
Effect                                          SS          F-Value     Sig. of F
Total logic errors detected                     11.705      5.703       .020
Total mechanical errors detected                .001        .000        .989

Panel C: Generation of Alternatives (Hypothesis 1c), n=71
Effect                                          SS          F-Value     Sig. of F
Total alternatives generated                    .005        .001        .972
Total different types of alternatives           .917        .128        .510

Panel D: Scrutiny of Information from and about People (Hypothesis 1d), n=83
Effect                                          SS          F-Value     Sig. of F
People screens examined                         2.297       .128        .722
People time (in minutes)                        .00006      .000        .998

Table 4
ANOVA Among all Four Groups

Panel A: Information Search, n=82
Effect                                          SS          F-Value     Sig. of F
Workpaper screens examined                      97.457      .053        .984
Workpaper time (in minutes)                     249.351     .088        .966
Number of information search queries (n=74)     29.486      1.611       .195

Panel B: Detection of Contradictions, n=74
Effect                                          SS          F-Value     Sig. of F
Total logic errors detected                     19.523      3.253       .027
Total mechanical errors detected                3.711       1.529       .214

Panel C: Generation of Alternatives, n=71
Effect                                          SS          F-Value     Sig. of F
Total alternatives generated                    9.973       .773        .513
Total different types of alternatives           6.155       .988        .403

Panel D: Scrutiny of Information from and about People, n=82
Effect                                          SS          F-Value     Sig. of F
People screens examined                         172.944     3.547       .018
People time (in minutes)                        70.252      2.022       .118

Table 16
Planned Comparison between Low Skepticism, Low Risk Condition and the Other Groups

Panel A: Information Search
Effect                                          t           df          Significance
Workpaper screens examined                      .243        79          .809
Workpaper time (in minutes)                     .014        79          .989
Number of information search queries            2.112       71          .038

Panel B: Detection of Contradictions
Effect                                          t           df          Significance
Total logic errors detected                     1.474       71          .145
Total mechanical errors detected                .853        71          .397

Panel C: Generation of Alternatives
Effect                                          t           df          Significance
Total alternatives generated                    1.159       68          .250
Total different types of alternatives           1.585       71          .117

Panel D: Scrutiny of Information from and about People
Effect                                          t           df          Significance
People screens examined                         2.456       79          .016
People time (in minutes)                        2.019       79          .047

Appendix A
Contradiction or Error Example

Contradiction

Statements in Workpapers:

  There were strong markets at the end of the prior year. Philips Co. had forecast that strong markets would bear both price and volume increases. Management decided to be very aggressive and to do whatever was necessary to take full advantage of the situation. . . Second, to get in on the price increases it had predicted would result from the stronger market at the end of FY 1997, Philips Co. changed its pricing policies across the board.

  Sales were down 5.5%. There are two related reasons for this. . . . Second, Philips decreased its prices across the board in both March and June in order to reduce its inventory buildup (per schedule A1, inventory: $205,500 and $164,040 in 1998 and 1997), and to shore up slackening sales.

Inference or Problem: Were prices increased or decreased?

Error

Statements in Workpapers:

  Philips Co. has plants located in Stillwater, New Mexico (also its central office) and Coldton, Minnesota. Philips closed a plant in Sunnyvale, California in December 1997 when its lease on the building expired.

  This note was retired with the proceeds from the sale of land and the closing of the Coldton plant.

Inference or Problem: Which plant closed?

Appendix B

"Normal" Audit Risk Instructions

GENERAL BACKGROUND INFORMATION

Terry Brandon, who prepared the workpapers you have received, has about one year of experience, including a limited amount of work on the prior year's audit of Philips Co. You have no reason to question Terry's basic competence.

CLIENT INFORMATION: The client is Philips Co., a designer and manufacturer of water treatment systems. Your firm has audited Philips Co. for the past seven years, and rendered unqualified opinions in all cases.

"High" Audit Risk Instructions

GENERAL BACKGROUND INFORMATION

Terry Brandon, who prepared the workpapers you have received, has about one year of experience with Henderson and Franks, including a limited amount of work on the prior year's audit of Philips Co. You have no reason to question Terry's basic competence, but since Terry was trained by another firm, there will probably be minor stylistic differences between your approaches to the same work.

CLIENT INFORMATION: The client, which is new to you, is Philips Co., a designer and manufacturer of water treatment systems. Your firm acquired Philips Co. as a client on 3/31/98 when your firm merged with Henderson & Franks, a well respected regional CPA firm. Henderson & Franks has audited Philips Co. for the past seven years, and rendered unqualified opinions in all cases.