Core Concepts of Information Integrity: A Survey of Practitioners

J. E. Boritz
School of Accountancy
University of Waterloo

September 30, 2003

Funding provided by Information Systems Audit and Control Association (ISACA). The views expressed here are solely the author’s and do not necessarily reflect views held by ISACA. Research assistance provided by Malik Datardina.

Abstract

This paper reports on a survey of practitioners conducted during two workshops held in Toronto and Chicago in the Spring and Summer of 2003, respectively, to gather the following information:

1. Relative Importance of Information Integrity Attributes and Enablers
2. Definition of Information Integrity
3. Definition of Core Attributes of Information Integrity
4. Relationship Between Information Integrity Attributes and Enablers
5. Experience with Information Integrity Impairments for Selected Industries
6. Experience with Information Integrity Impairments for Selected Data Streams
7. Information Integrity Impairments by Stages of Processing
8. Information Integrity Impairments by Phases of the System Acquisition/Development Life Cycle
9. Information Integrity Impairments by System Component

The participants were experienced professionals with an average of 17 years of work experience and an average of 5 years in their current position. The organizations represented were predominantly small to medium size entities; 3/4 had fewer than 10,000 employees. The most represented sector was the financial services sector, followed by consulting and healthcare. Information systems was the largest area represented, followed by audit. Half of the participants possessed a CISA certificate, often in combination with other professional certifications.
The survey was based on an extensive review of the literature on data quality and information integrity that led to the development of a framework that is broader than that provided in the widely recognized international control guideline COBIT (ISACA, 2000), but narrower than the concept of information quality discussed in the literature. COBIT is a global standard that is intended for widespread use for internal and external assurance on information technology controls. However, one of the policy recommendations arising from the findings of this study is that the COBIT definition of information integrity be reconsidered. Also, a two-layer framework of core attributes and enablers (identified in this study) should be considered.

A Framework for Information Integrity: A Survey of Practitioners

An entity’s information assets are an important component of its intangible assets, which, in turn, constitute more than 80% of an entity’s market value (ITGI, 2001). Today, senior executives’ accountability for the integrity of company financial information is front-page news. It is noteworthy that a CICA publication aimed at Boards of Directors lists data integrity as one of 20 key issues that Directors should be concerned with (CICA, 2002). The impact of information integrity impairments can be far-reaching and costly in money, time, resources, reputation and customers (Betts, 2001; Redman, 1998; Wang, Storey and Firth, 1995). Small mistakes made by even the most well-meaning employee can have a catastrophic effect, propagating errors throughout the organization.
Therefore, it is both surprising and disconcerting that a 2001 survey (PricewaterhouseCoopers, 2001) found that 75% of respondents had experienced problems as a result of data quality issues, although 75% had also benefited from effective data management actions in the form of reduced processing costs (fewer reconciliations), increased sales due to better customer data and increasing automation of decisions and processes. The survey found a dangerous complacency about data management:

- 2/3 of Boards do not address it
- 2/3 place responsibility for it solely on the CIO or IT department
- 1/2 of CEOs do not see it as a strategic issue
- 1/3 of respondents believe management does not place enough importance on it
- only 1/3 are very confident about the quality of their own data and even fewer are very confident about the quality of others’ data

In light of these alarming findings, it is important to define and validate a framework for information integrity that can be used to guide management risk assessments and control deployment and to guide assurance providers on the criteria to be addressed by information integrity oriented assurance services.

A limitation of today’s control and assurance efforts related to information integrity is that the frameworks of the accounting and auditing professions have focused almost exclusively on financial information. As the focus of information integrity control and assurance efforts expands to other decision-related information beyond financial information, a need arises for a comprehensive, generally accepted definition of information integrity and a control framework linked to such a definition. One well known definition of information integrity (ISACA, 2001) defines it by the three attributes of completeness, accuracy and validity. In contrast, other publications identify attributes of information without specifically linking them to the concepts of information integrity (FASB, 1980).
Integrity means an unimpaired or unmarred condition - entire correspondence of a representation with an original condition (Webster’s Third New International Dictionary). Applied to information, integrity is the representational faithfulness of the information to the condition or subject matter being represented by the information. Thus, information integrity is a concept that focuses primarily on the reliability of information, but information integrity attributes also play central roles in information relevance and useability, information quality and information value. In other words, the concept of information integrity draws on all of these concepts, but is narrower than information quality, falling in the overlapping area of the three major information quality concepts of relevance, reliability and useability illustrated in Figure 1. Thus, information integrity includes the attributes of accuracy/correctness, validity/authorization, security, consistency/comparability, dependability/predictability, auditability/verifiability and credibility/assurance from the reliability concept; completeness, currency/timeliness and granularity/aggregation from the relevance concept; and understandability and accessibility/availability from the useability concept.

Figure 1 Relationship Between Information Integrity and Other Information Quality Concepts
[Diagram: INFORMATION INTEGRITY lies in the overlap of Reliability, Relevance and Useability, all within INFORMATION QUALITY.]

The purpose of this study is to identify and validate a set of core concepts of information integrity to facilitate the development of:

1. comprehensive management approaches for addressing information integrity concerns extending beyond financial statement considerations to operational and managerial information,
2. assurance services to provide assurance about all aspects of information integrity extending beyond assurance on financial statement information, and
3. research on causes of information integrity problems and potential solutions to those problems.

To these ends, this paper reports on a survey of practitioners conducted during two workshops held in Toronto and Chicago in the Spring and Summer of 2003, respectively. The purpose of the survey was to validate an information integrity framework that was developed pursuant to a review of literature conducted under the auspices of the Information Systems Audit and Control Association (ISACA) and summarized in another publication (Boritz, 2003a).

Information Integrity Framework

Information integrity is defined as the representational faithfulness of information to the subject matter represented. Representational faithfulness is further defined by four core attributes: Complete, Current/Timely, Accurate/Correct, Authorized/Valid. Figure 2 summarizes this framework. It identifies four core attributes of information integrity and seven clusters of related enablers, which are described below.

Figure 2 Summary of Relationship between Core Concepts of Information Integrity and Enablers
[Matrix. Rows (core attributes of information integrity): Complete; Current/Timely; Authorized/Valid; Accurate/Correct. Columns (enablers): Credible/Assured; Verifiable/Auditable; Dependable/Predictable; Consistent/Comparable/Standards; Available/Accessible; Secure; Understandable/Appropriate level of granularity/aggregation.]
Note: E = Enabler of Core Attribute

Core Attributes of Information Integrity

Accuracy/Correctness

For many, the key attribute of representational faithfulness is accuracy, which asserts that the information corresponds to reality (English, 1999) (i.e., what is represented in the information system corresponds to a real world object or event with some degree of precision). For example, if the database states there are two cars on the lot but there are actually zero cars on the lot, then the database is inaccurate.
The concept of information accuracy is also linked to neutrality (lack of bias) in the manner in which subject matter is represented.

Completeness

Accuracy by itself, however, is insufficient to convey the full dimensionality of the requirements for representational faithfulness. For example, representational faithfulness requires completeness of information in both time and space. All information necessary to reflect business activity in accordance with established business rules must be captured, processed, stored and reported. Thus, there is a fundamental dependency between completeness and accuracy because the measurement and processing limitations of information processing systems may prevent 100% real-time completeness, especially for subject matter that changes frequently, which, in turn, prevents 100% accuracy. For example, if there are three cars on the lot, two cars in the database, and one car in a receipt transaction that has yet to update the database, then a process that ensured processing completeness would contribute to database accuracy as well. In other words, every discussion of accuracy is also a discussion of completeness and every discussion of completeness is also a discussion of accuracy.

Currency/Timeliness

Currency is a form of completeness related to the time dimension of information processing. Information must be current/timely within pre-set definitions of the duration of time in an information period, the interval between periods, and the acceptable delay from a set cut-off point. Information completeness may be affected by processing delays or real world changes over time with a commensurate impact on information accuracy (Bolour, Anderson, Dekeyser, and Wong, 1982). Since time is continuous, completeness and accuracy must be understood in a context that defines acceptable limits for information currency. It must also be accepted that absolute completeness and accuracy are impossible or impractical to achieve.
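The car-lot example and the notion of an acceptable delay can be made concrete with a short sketch. The code below is illustrative only: the `Receipt` record, the `reconcile` and `is_current` helpers, and the one-hour tolerance are all assumptions introduced here, not part of the framework.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Receipt:
    """A receipt transaction; `posted` is False until it updates the database."""
    cars: int
    posted: bool


def reconcile(physical_count, database_count, receipts):
    """Explain the gap between the lot and the database using unposted receipts."""
    pending = sum(r.cars for r in receipts if not r.posted)
    gap = physical_count - database_count
    return {"gap": gap, "pending": pending, "unexplained": gap - pending}


def is_current(as_of, now, max_delay):
    """True if the record's age is within the agreed tolerance (an assumed policy value)."""
    return now - as_of <= max_delay


# Three cars on the lot, two in the database, one receipt not yet posted.
result = reconcile(3, 2, [Receipt(cars=1, posted=False)])
print(result)

# The database was last updated 90 minutes ago; the assumed tolerance is one hour.
now = datetime(2003, 4, 8, 12, 0)
stale = not is_current(datetime(2003, 4, 8, 10, 30), now, timedelta(hours=1))
print(stale)
```

In this sketch the database is inaccurate by one car, but the whole gap is explained by incomplete processing, which is the dependency between completeness and accuracy described above.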
As presented here, processing timeliness and information currency are really aspects of information completeness, which in turn, is an aspect of accuracy; however, because of its unique relationship to the dimension of time and the change it engenders, it is useful to identify currency/timeliness as a separate attribute of information integrity.

Complete, current and timely information is critical to effective strategic management. The availability of decision critical information enables managers to analyse unusual results, propose remedial action to proactively correct any problems, and ensure that projections reflect the most current information available (Redman, 1995). Conversely, incomplete or delayed information can undermine the strategic management of an organization, and make it difficult to alter direction when necessary (Ward and Ward, 1988).

There must be an understood tolerance for information omissions and delays in the volume of transaction processing and the timeliness of processing. Since the tolerances for information integrity may differ among stakeholders, it may be impractical to set standards for information currency that meet the most stringent requirements. Instead, various forms of time stamping can provide useful information to enable stakeholders to assess the limitations of information integrity on this dimension. When information is enhanced by time stamping, its degree of accuracy is more understandable and more verifiable.

Validity/Authorization

Representational faithfulness of information about metaphysical objects implies that the information is valid in ways other than correspondence with an original physical condition. The concept of validity means that information corresponds to conditions, rules or relationships approved by parties with the delegated authority to do so. Thus, valid information is authentic (from the purported source), approved and non-repudiable.
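A minimal sketch of a validity check in this sense follows. It assumes a hypothetical table of delegated approval limits per role; the role names, amounts and the `is_valid` helper are invented for illustration and are not drawn from the survey or from any standard.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical delegated-authority table: role -> largest amount the role may approve.
APPROVAL_AUTHORITY = {"clerk": 5_000, "manager": 50_000}


@dataclass
class ApprovedItem:
    """An information item (e.g., a limit or a transaction) and its recorded approval."""
    item_id: str
    amount: int
    approved_by_role: Optional[str]  # None means no approval was recorded


def is_valid(item, authority=APPROVAL_AUTHORITY):
    """Valid only if approved by a role whose delegated authority covers the amount."""
    if item.approved_by_role is None:
        return False  # never approved
    ceiling = authority.get(item.approved_by_role)
    if ceiling is None:
        return False  # approver holds no delegated authority
    return item.amount <= ceiling  # approval must be within the scope granted


items = [
    ApprovedItem("A-1", 4_000, "clerk"),
    ApprovedItem("A-2", 20_000, "clerk"),    # exceeds the clerk's authority
    ApprovedItem("A-3", 20_000, "manager"),
    ApprovedItem("A-4", 1_000, None),        # no approval on record
]
for item in items:
    print(item.item_id, is_valid(item))
```

Note that the check never compares the item with a physical reality; it only tests conformity with approved rules, which is the sense of validity used here.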
Information is valid if it accords with specified business rules that define attributes of information and relationships among information items, governing form, content, function, time, source, and destination. Transactions are valid if they were initiated and executed by personnel or systems that have been granted the authority to do so and if approvals are authentic and within the scope of the authority granted to the approver(s). For example, if the credit limit assigned to a customer reconciles to the company’s rules and procedures used to set credit limits, the credit limit would be “valid”. Thus, the concept of validity includes elements of both accuracy and authorization. A validation process may therefore require an investigation of an individual item, a relationship between an item and another item, or a relationship between an item and a business rule, policy or standard (Agmon and Ahituv, 1987).

Enablers

Security

Security includes physical and logical access controls to information in motion and at rest to protect the integrity of information against acts of nature and intentional malicious acts such as theft, misuse, unauthorized creation, viewing, modification, dissemination or destruction, as well as inadvertent errors. This enabler has a direct impact on the validity/authorization aspect of information integrity as well as upon the other attributes.

Available/Accessible

For information to be complete, current and timely, it needs to be available and accessible to users in accordance with business specifications and to be retrievable in a usable form when required. Information that is not accessible when needed would not have any practical consequences for users’ activities or decisions, except in the negative sense of limiting the quality of the information and users’ decisions based on that information (O'Reilly III, 1982).
For information to be deemed accessible, users need to be able to work with the information in a way that meets their needs (O'Reilly III, 1982; Wang and Strong, 1996). Practically, this requires the use of a robust system to provide the information. Such a system needs to be available when needed, enable the users to change the system (i.e., without programming changes) to meet their needs, operate efficiently and effectively, and be able to accommodate users’ expanding need for information (Halloran, Manchester, Moriarty, Riley, Rohrman and Skramstad, 1978). Enabling users to change the system can be very empowering for them; however, it can also be a double-edged sword. While it gives "power users" the ability to improve the effectiveness and efficiency of their work, it also allows users with lesser abilities to mire themselves. Often, as a result of such "miring", the system may be perceived as being of lower quality, since users may not get the quality of information they are looking for, oblivious to the fact that the problem is self-inflicted.

Understandable/Granularity/Aggregation

The level of aggregation (granularity) of the information will affect its understandability and, hence, its usefulness for controlling information integrity. For some purposes, highly aggregated information may be called for, whereas for other purposes, very detailed information may be required. Thus, appropriately tailored levels of granularity/aggregation can be enablers of information integrity. A proxy for the understandability of information is its conformity with specified user requirements.

Consistency/Comparability/Standards

Information is consistent/comparable if it follows the same rules (documented in system and user specifications) over time and across systems, including response and delay periods. Approvals are in accordance with the entity’s policy, legal and regulatory requirements.
Legal and regulatory requirements may specify the amount/extent of information, its degree of precision, degree of currency and other attributes such as consent to store or share the information if it is personally identifiable information.

Consistency in how information is measured and presented or displayed to decision makers (Kahn, Strong and Wang, 2002) facilitates information dependability and predictability. Consistency, in turn, is enhanced by stability of the measurement and presentation rules over time or space. Such rules represent standards against which information measurement and presentation can be compared and assessed.

Environmental uncertainties perturb information systems and can lead to changes that can adversely affect stability and consistency and, hence, their dependability. Examples of such environmental factors include complexity (e.g., a system incorporates the use of new interfaces with external entities), change (e.g., regulatory changes), devices and computer crime (e.g., hacking) (Nayar, 1996). For information to be trusted, there must be controls to ensure that it is safeguarded against forgery or tampering by unauthorized parties (Winter and Huber, 2000).

Dependability/Predictability

Several similar, but not identical, characteristics are grouped together under this heading, including dependability, repeatability, stability and predictability. Since information is the result of an information process or system, the reliability of information depends on the reliability of the processes that produce it, including the reliability of embedded change management processes (Halloran, Manchester, Moriarty, Riley, Rohrman and Skramstad, 1978). It is important to note that these characteristics relate to the reliability of the information rather than the events that the information is about. Events may be inherently unpredictable, but the information about them need not be.
For example, a baseball player may not hit a home run each time at bat because athletic performance is unpredictable, but the information about the baseball player’s performance may be virtually 100% reliable because reliable processes are used to gather and record it. Dependability/predictability is enhanced by low levels of information omission risk, information delay risk, information error risk and information tampering/repudiation risk, although it is important to recognize that some types of environmental uncertainty that may contribute to information risk are difficult or impossible to eliminate.

Verifiability/Auditability

Verifiability is the ability for independent observers, applying the same processes and tolerances for completeness, currency and accuracy that are used to produce the information, to replicate substantially the same result. An audit trail enables tracing of inputs to outputs to ensure that all are processed, vouching of records to source inputs, recomputation of amounts or aggregates, and tracing to authorization tables or signatures to ensure compliance with delegated authorities. Time stamping enables determination of the currency of information.

In order to verify and communicate information integrity to parties external to the information process, the various components of integrity need to be neutral, objective and measurable. This implies an approved or agreed upon set of processes or measurement rules; otherwise it would be difficult to obtain the measurement consensus that verifiability requires. In a business context, the approved set of processes or measurement rules springs from Board-approved policies and standards and any applicable legal and regulatory requirements. Among other things, these must define the degree of precision or tolerable error for the information integrity attributes of completeness, currency and accuracy.
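Verification by recomputation within an agreed tolerance, as described above, can be sketched as follows. The `verify_total` helper, the sample amounts and the tolerance values are assumptions made for illustration; in practice the tolerance would come from the approved policies and standards.

```python
from decimal import Decimal


def verify_total(reported_total, source_amounts, tolerance):
    """Recompute an aggregate from source records and compare it with the reported figure.

    Returns the recomputed total, the absolute difference, and whether the
    difference falls within the agreed tolerance.
    """
    recomputed = sum(source_amounts, Decimal("0"))
    difference = abs(recomputed - reported_total)
    return recomputed, difference, difference <= tolerance


# Hypothetical source records and reported aggregate.
source = [Decimal("100.00"), Decimal("250.50"), Decimal("49.50")]
reported = Decimal("400.25")

# The same figure passes under one tolerance and fails under a stricter one.
loose = verify_total(reported, source, tolerance=Decimal("0.50"))
strict = verify_total(reported, source, tolerance=Decimal("0.10"))
print(loose)
print(strict)
```

The point of the sketch is that two independent observers reach the same conclusion only if they share both the recomputation rule and the tolerance.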
Auditability features make it possible to trace information back to its source and confirm that it is complete, current, accurate, and authorized. Key auditability features include unique transaction/record identifiers such as a unique document or transaction identification number, creation date and modification date time stamps, a record of the document or transaction source and collection method, record retention and archiving, accessibility information and unambiguous and clearly documented re-computation rules (Winter and Huber, 2000).

Credibility/Assurance

The intangibility of information may limit the ability of users to assess information to determine whether or not it has integrity (Wang, Reddy and Amar, 1993; Richters and Dvorak, 1988). As a result, the integrity of the information may need to be independently verified or assured to be considered trustworthy and credible.

The Importance of Context

As will be discussed further in this report, information integrity attributes must be considered in the context of the stakeholders’ specific requirements related to the information and the recognition that perfect information integrity is not achievable because completeness, currency, accuracy, and authorization are affected by delays in data recognition, processing and utilization that, however small, introduce a degree of information impairment into all information processing functions. Thus, the standard for information integrity is not 100% representational faithfulness, but rather, representational faithfulness within accepted tolerances established in consultation with users of the information, parties responsible for maintaining the integrity of the information and assurance providers who are charged with confirming the integrity of information.
The tolerances or materiality guidelines that are established must take into account the sensitivity of the information and the requirements of the user decision-models that are served by the information.

Survey

Since the core concepts described in the previous section were derived from a review of the professional literature, a survey was conducted to validate the importance of these concepts. The concepts were presented to two groups of professionals using a workshop format for discussion and comment, but first, they were asked to complete a survey to gather the following information:

1. Relative Importance of Information Integrity Attributes and Enablers
2. Definition of Information Integrity
3. Definition of Core Attributes of Information Integrity
4. Relationship Between Information Integrity Attributes and Enablers
5. Experience with Information Integrity Impairments for Selected Industries
6. Experience with Information Integrity Impairments for Selected Data Streams
7. Information Integrity Impairments by Stages of Processing
8. Information Integrity Impairments by Phases of the System Acquisition/Development Life Cycle
9. Information Integrity Impairments by System Component

The following process was used to gather this information.

Participants

Workshop participants were volunteers who responded to announcements distributed electronically by the Toronto and Chicago chapters of ISACA. Workshop participants were about equally drawn from Toronto and Chicago. The participants were experienced professionals with an average of 17 years of work experience and an average of 5 years in their current position. About 2/3 of the participants were male and 1/3 were female. The organizations represented were predominantly small to medium size entities; 3/4 had fewer than 10,000 employees. The most represented sector was the financial services sector, followed by consulting and healthcare. Information systems was the largest area represented, followed by audit.
Almost half of the participants also had a formal information systems education as represented by their undergraduate degrees, followed by accounting and other management related fields. As might be expected, half of the participants possessed a CISA certificate, often in combination with other professional certifications. A demographic summary of the participants is provided in Table 1. All in all, the demographics for the participants indicated an experienced and knowledgeable group of professionals whose views about the attributes and enablers presented in this report should be carefully considered.

Format of the Workshops

Pre-reading material was distributed in advance. When participants arrived at the workshop, a brief 30-minute summary of this material was provided and questions about the purpose and structure of the workshop were answered. This took a further 30 minutes. Then the participants were asked to complete the survey instrument. This task was done individually by each participant and took about 60 minutes to complete.

After the survey was completed, the data were transcribed into a spreadsheet and displayed to all participants and a discussion ensued. Generally, the discussion centered around similarities and differences in patterns of responses by workshop participants to identify issues or problems with the concepts. Responses to the survey were not changed except in one or two instances when errors were identified. Comments and observations made by the participants were captured by the researchers and are reported in this section of the report. This part of the workshop took approximately 60 minutes.
Table 1 Summary of Workshop Participant Demographics

City
  Toronto – April 8, 2003                        15
  Chicago – June 23, 2003                        13
  Total                                          28

Area
  Information Systems                            11
  IS/Management/Accounting/Production             4
  Audit                                           9
  Accounting/Finance                              2
  Sales/Marketing/Other                           2
  Total                                          28

Number of Employees in your Firm
  1-100                                           5
  100-1,000                                       7
  1,000-10,000                                   10
  10,000-50,000                                   6
  50,000-100,000                                  2
  Total                                          28

Industry
  Financial Services                             11
  Consulting                                      8
  Health care                                     4
  Energy                                          2
  Public sector                                   2
  Telecommunications                              1
  Total                                          28

Gender
  Male                                           19
  Female                                          9
  Total                                          28

College Major
  Information Systems                             9
  IS Management                                   2
  IS Finance                                      1
  Accounting                                      8
  Economics                                       2
  Management/Finance                              3
  Statistics/Other                                2
  Total                                          28

Graduate Degree
  None                                           21
  MBA                                             5
  MS                                              1
  MBA plus MSMIS                                  1
  Total                                          28

Professional Certificate
  None                                           11
  CISSP                                           2
  CISA only                                       3
  CISA plus CA, CMA, CGA, CPA, CIA               12
  Total                                          28

After a lunch break, the participants were divided into groups based on the transaction streams that they had self-selected when they were completing the survey section addressing the participants’ experience with information integrity impairments. This part of the workshop took 3 hours in total and consisted of several sessions devoted to group discussion and information sharing by all the groups. The content of the discussions in this latter part of the workshop is reported separately in Boritz (2003c). Upon completion of this part of the workshop, participants were thanked for their contribution and the workshop ended.

Summary of Findings

1. Relative Importance of Information Integrity Attributes and Enablers

The participants were asked to consider the following information integrity attributes and enablers and rank them in importance from 0 (completely unimportant or irrelevant) to 10 (absolutely essential). Subsequently, they were asked to identify a data stream with which they had personal experience and rate the severity of observed information integrity impairments, where 0 represents not experienced and 10 represents extremely serious impairments exceeding 1% of gross revenues.
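The two ratings collected here are combined in Table 2 by multiplying importance by severity and ranking the products. A minimal sketch of that calculation is shown below, using the mean ratings for the core attributes as reported in Table 2. The function names are invented for illustration, and rounding half up to one decimal place is an assumption about how the published scores were rounded.

```python
from decimal import Decimal, ROUND_HALF_UP

# Mean ratings for the core attributes as reported in Table 2: (importance, severity).
RATINGS = {
    "Completeness": ("9.4", "7.4"),
    "Currency": ("8.9", "6.5"),
    "Timeliness": ("9.0", "6.8"),
    "Accuracy": ("9.8", "6.0"),
    "Correctness": ("9.6", "4.6"),
    "Validity": ("9.2", "7.8"),
    "Authorization": ("8.8", "7.7"),
}


def combined_scores(ratings):
    """Multiply importance by severity and round to one decimal place."""
    scores = {}
    for concept, (importance, severity) in ratings.items():
        product = Decimal(importance) * Decimal(severity)
        scores[concept] = product.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
    return scores


def rank_order(scores):
    """Rank concepts from highest combined score (1) to lowest."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {concept: position for position, concept in enumerate(ordered, start=1)}


scores = combined_scores(RATINGS)
ranks = rank_order(scores)
for concept in RATINGS:
    print(f"{concept:<14}{str(scores[concept]):>6}{ranks[concept]:>3}")
```

Decimal arithmetic is used so that products such as 8.9 x 6.5 round predictably; with binary floating point the last digit could differ.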
As shown in Table 2, the concepts identified had high to very high ratings, indicating broad support for the framework components. All of the attributes and enablers were significantly different (p < .05) from 5, the midpoint of the scale, which could be considered to represent a neutral degree of importance. Interestingly, the enablers as a whole were rated lower than the primary information integrity attributes. This finding is particularly noticeable when a combined importance scale is created by multiplying importance by severity of observed impairments. It is interesting that the combined importance rating highlights the importance of validity/authorization.

Table 2 Importance of Information Integrity Attributes and Enablers (N=28)

Importance: 0 = irrelevant, 10 = absolutely essential. Severity: overall severity of observed impairments. Combined Score: Importance * Severity. Rank order is given within each group.

Concept              Importance   Severity   Combined Score   Rank Order
Core attributes
Completeness             9.4         7.4          69.6             2
Currency                 8.9         6.5          57.9             6
Timeliness               9.0         6.8          61.2             4
Accuracy                 9.8         6.0          58.8             5
Correctness              9.6         4.6          44.2             7
Validity                 9.2         7.8          71.8             1
Authorization            8.8         7.7          67.8             3
Enablers
Security                 8.8         4.6          40.5             7
Availability             8.5         6.3          53.6             1
Accessibility            8.3         5.7          47.3             4
Understandability        8.6         3.5          30.1            11
Granularity              7.5         3.1          23.3            13
Aggregation              7.3         3.8          27.7            12
Consistency              8.3         5.4          44.8             5
Standards                7.9         4.7          37.1             8
Dependability            8.9         5.0          44.5             6
Predictability           7.9         4.4          34.8            10
Verifiability            8.9         5.6          49.8             2
Auditability             8.5         5.8          49.3             3
Credibility              8.4         4.3          36.1             9

Comments related to these items included the following:

- Some participants questioned the inclusion of useability factors such as understandability and availability/accessibility in an information integrity model. Others agreed with including these concepts because they reflected the user dimension within the information integrity framework.
- Some participants questioned whether enablers such as granularity and aggregation related to information usefulness rather than integrity, and this is reflected in the comparatively lower ratings that these two items received. The response to this comment is that inappropriate granularity and aggregation, in addition to affecting usefulness of information, could detrimentally affect the functioning of decision-making and control processes, thereby affecting information integrity as well.
- Some participants questioned the value of the enabler “standards,” while others defended its inclusion. There seemed to be a consensus that perhaps the enabler should be described as “enforced standards,” since a number of participants questioned the value of standards if they weren’t enforced. Other participants observed that the mere existence of standards should improve information integrity as compared to the situation where there are no standards, even if there was no formal enforcement system.
- Some participants questioned whether predictability should be an enabler since it represented an inherent attribute of the information rather than an actionable item. The response to this comment is that enablers can be inherent properties of the information. An enabler does not need to be an actionable item.

2. Definition of Information Integrity

Information integrity was defined as the representational faithfulness of the information to the condition or subject matter being represented by the information. About 75% (22 of 28) of the participants agreed with this definition. Those that did not agree had the following comments:

- It is too close to the FASB conceptual framework.
- Faithfulness is a value-loaded term that can be interpreted subjectively; why not simply use the attributes to define information integrity rather than an overall term such as “representational faithfulness”? The response to this comment is that integrity has come to have many meanings in common usage, and the term is often associated with honesty and truthfulness. “Representational faithfulness” reflects the dictionary meaning of integrity.
- There should be some qualification such as materiality or mention of a context when presenting the framework.
- Only completeness and accuracy should be in this definition because they are less subjective than the other attributes. The response to this comment is that many practitioners may not see the time dimension that is implicit in completeness. Currency/timeliness of information is a very significant issue affecting the representational faithfulness of information, so it needs to be reinforced. This was a major omission in the COBIT definition of information integrity, which led to the omission of controls oriented towards the achievement of this attribute. Also, for many items integrity can only be determined by conformity with business rules. There is no real world physical reference point for establishing the representational faithfulness of much business information. For example, customer credit limits or a preferred supplier list do not describe physical realities of customers or suppliers; they represent business rules that have integrity if they are authorized and otherwise do not.

3. Core Attributes of Information Integrity

Similarly, 75% (21 of 28) of the participants agreed with the definition of representational faithfulness using the core information attributes of completeness, currency, accuracy, and authorization. Those that did not had the following comments:

- It is too financially focused.
- There are too many attributes; only completeness and accuracy are required.
There are too few attributes; some of the enablers should be included, particularly verifiable/auditable.

A context is required to define these attributes or make them objectively measurable.

4. Relationship Between Information Integrity Attributes and Enablers

Participants were presented with a version of Figure 2 (with all cells blank) and asked to consider the clusters of enablers listed in the columns and rate their importance to the attributes listed in the rows from 0 (completely unimportant or irrelevant) to 10 (absolutely essential). Table 3 summarizes the survey responses.

Table 3

Enabler clusters (columns): Secure; Available/Accessible; Understandable/Appropriate level of granularity/aggregation; Consistent/Comparable/Standards-based; Dependable/Predictable; Verifiable/Auditable; Credible/Assured.

Panel A: Hypothesized Relationship
[20 of the 28 attribute/enabler cells are marked E; the positions of the marks are not recoverable from the flattened layout.]
Note: E = Enabler of Core Attribute

Panel B: Actual Relationship

Core attribute     Secure  Available  Understandable  Consistent  Dependable  Verifiable  Credible
Complete           7.1     7.9        7.1             7.9         7.5         7.9         7.8
Current/Timely     5.5     8.8        5.2             6.2         7.3         6.5         6.7
Authorized/Valid   8.7     5.1        4.1             6.8         6.6         8.5         7.8
Accurate/Correct   7.5     6.3        6.3             7.9         8.4         8.7         8.9

Panel C: Relationships to Investigate Further
[Same values as Panel B; the highlighting of the cells of interest is not recoverable from the flattened layout.]

Panel A summarizes the anticipated relationship between the primary attributes and enablers as discussed earlier. Panel B summarizes the averages of the actual responses provided by the workshop participants. Panel C highlights unexpected relationships that were identified by the workshop participants. The pattern of participants’ responses is broadly supportive of the anticipated relationship between the primary attributes and enablers.
Panel C indicates that Security was judged to have an impact on each of the four core attributes, although the strongest relationship was the hypothesized relationship between security and authorization/validity. Also, the relationship of understandability/aggregation/granularity to completeness was stronger than expected, and its relationship to accuracy/correctness was weaker than expected. These findings may need follow-up to determine whether the hypothesized relationships need to be revised or whether a better match between Panel A and Panel B could be obtained through training or increased familiarity with the terms developed through practical application and experience.

5. Information Integrity Impairments by Industry

Participants fell into four industry groups: financial services, consulting, health care and other. Table 4 provides a breakdown of the overall severity of impairments relative to specific information integrity attributes and enablers by industry group.

Table 4 Information Integrity Impairments by Industry
Importance: 0 = irrelevant, 10 = absolutely essential. Overall = overall severity of impairments; the remaining columns show severity by industry.

Concept            Importance    Overall       Financial Services  Consulting    Health Care   Other
                                 (N=28)        (N=11)              (N=8)         (N=4)         (N=5)
Completeness       9.4           7.4           7.0                 8.5           7.5           6.6
Currency           8.9           6.5           6.9                 5.6           8.8           5.4
Timeliness         9.0           6.8           5.9                 7.9           8.5           5.6
Validity           9.2           6.0           6.0                 7.1           6.5           3.8
Authorization      8.8           4.6           3.7                 5.3           4.5           5.6
Accuracy           9.8           7.8           7.3                 8.9           7.8           7.0
Correctness        9.6           7.7           3.5                 0.0           7.3           6.6

Security           8.8           4.6           3.8                 5.0           6.3           4.4
Availability       8.5           6.3           5.6                 7.1           7.8           5.0
Accessibility      8.3           5.7           4.9                 6.8           7.0           4.4
Granularity        7.5           3.5           2.3                 4.5           4.5           3.6
Aggregation        7.3           3.1           2.2                 3.1           4.3           3.8
Understandability  8.6           3.8           2.0                 4.9           5.8           4.2
Consistency        8.3           5.4           5.2                 5.6           6.0           5.2
Standards          7.9           4.7           3.8                 3.9           5.5           7.0
Dependability      8.9           5.0           3.4                 4.6           7.8           6.2
Predictability     7.9           4.4           3.3                 4.4           5.8           5.2
Verifiability      8.9           5.6           4.4                 5.0           6.3           8.6
Auditability       8.5           5.8           4.8                 5.3           5.5           8.8
Credibility        8.4           4.3           3.5                 4.4           6.5           3.8

It is interesting to note that no significant industry differences were identified in the overall importance or severity rankings (or the combination of the two ratings), except that the importance of accuracy was rated significantly higher (p<.05) by participants in consulting, financial services and health care than by those in the “other” category. It is also noteworthy, however, that a comparison of overall importance with severity ratings by industry yielded significant differences between the two sets of ratings for every attribute and enabler under financial services. Generally speaking, the observed impairment ratings were significantly lower than the importance ratings.

6. Information Integrity Impairments by Data Stream

As mentioned previously, participants were asked to identify a data stream with which they had personal experience and relate the information integrity impairments to that stream. Participants were asked to rate the severity of the impairments, where 0 represents not experienced and 10 represents extremely serious impairments exceeding 1% of gross revenues. Table 5 summarizes these ratings by data stream. The importance ratings from Table 2 are reproduced for ease of reference. Participants’ choices of data stream clustered around revenues, expenditures such as claims payments, management of customer account data and event capture involving shipping of goods or provision of services.
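The comparison of importance with observed severity discussed in Sections 5 and 6 can be illustrated with a small calculation. The sketch below is not the significance test used in the study; it simply subtracts the overall severity rating from the overall importance rating for the seven core attributes, using the two overall columns reported in Tables 4 and 5. The variable names `importance`, `severity` and `gaps` are illustrative.

```python
# Overall importance and overall severity ratings (0-10 scales) for the
# core attributes, copied from the "Importance" and "Overall (N=28)"
# columns of Tables 4 and 5.
importance = {
    "Completeness": 9.4, "Currency": 8.9, "Timeliness": 9.0,
    "Validity": 9.2, "Authorization": 8.8, "Accuracy": 9.8,
    "Correctness": 9.6,
}
severity = {
    "Completeness": 7.4, "Currency": 6.5, "Timeliness": 6.8,
    "Validity": 6.0, "Authorization": 4.6, "Accuracy": 7.8,
    "Correctness": 7.7,
}

# Simple descriptive gap: how far the observed severity rating falls
# below the importance rating (rounded to one decimal place).
gaps = {name: round(importance[name] - severity[name], 1) for name in importance}

# List the attributes from the largest gap to the smallest.
for name, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<14} importance={importance[name]:.1f} "
          f"severity={severity[name]:.1f} gap={gap:.1f}")
```

On these figures every gap is positive, in line with the observation that impairment ratings were generally lower than importance ratings, and authorization shows the largest gap (8.8 versus 4.6).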
Table 5 Information Integrity Impairments by Data Stream
Importance: 0 = irrelevant, 10 = absolutely essential. Overall = overall severity of impairments; the remaining columns show severity by data stream.

Concept            Importance    Overall       Revenues      Claims        Account Management  Event Capture
                                 (N=28)        (N=9)         (N=6)         (N=10)              (N=3)
Completeness       9.4           7.4           7.8           6.8           6.2                 9.3
Currency           8.9           6.5           7.6           4.8           6.1                 6.0
Timeliness         9.0           6.8           8.1           6.5           5.4                 6.0
Validity           9.2           6.0           7.2           5.2           4.7                 6.3
Authorization      8.8           4.6           5.0           6.2           3.5                 4.0
Accuracy           9.8           7.8           8.7           8.8           6.7                 6.3
Correctness        9.6           7.7           3.3           1.7           4.3                 5.7

Security           8.8           4.6           5.1           6.0           4.2                 1.7
Availability       8.5           6.3           6.8           6.5           5.1                 6.0
Accessibility      8.3           5.7           6.4           6.8           4.3                 3.7
Granularity        7.5           3.5           4.0           3.3           3.9                 0.0
Aggregation        7.3           3.1           3.4           3.2           3.3                 0.0
Understandability  8.6           3.8           4.4           3.5           4.2                 0.0
Consistency        8.3           5.4           6.2           6.2           5.2                 2.3
Standards          7.9           4.7           5.4           4.7           3.9                 3.3
Dependability      8.9           5.0           6.4           3.5           4.4                 2.3
Predictability     7.9           4.4           5.9           2.8           3.7                 2.3
Verifiability      8.9           5.6           6.1           6.7           5.1                 3.3
Auditability       8.5           5.8           6.2           7.0           5.3                 3.3
Credibility        8.4           4.3           3.9           5.0           4.5                 1.7

It is interesting that no significant differences were identified in importance or severity (or the combination of these two concepts) by data stream. It is also noteworthy, however, that a comparison of overall importance with severity ratings by data stream yielded significant differences between the two sets of ratings for every attribute and enabler in the account management stream. Generally speaking, the observed impairment ratings were significantly lower than the importance ratings.

7. Information Integrity Impairments by Stages of Processing

Participants were asked to relate the impairments to stages of processing, on a scale from 0 to 10, where 0 is no relationship and 10 is an absolutely powerful relationship. Table 6a summarizes these ratings by stage of transaction processing.
Table 6a

Stages of processing (columns):
Input: data source/transaction initiation; data collection, preparation and data entry; data editing and validation.
Transmission: communications over public/private networks.
Processing: updates to databases, files and tables; logic applications, computations, and analyses.
Storage: intermediate storage in databases or other logical storage devices; back-up and recovery, including offsite storage.
Output: output reporting, abstraction, and summarization; use of output/interface to other destination.

Type of impairment Average severity  Input         Transmission  Processing    Storage       Output
Completeness       7.4               8.0           5.1           6.8           3.8           6.4
Currency           6.5               5.2           3.2           4.1           2.0           4.6
Timeliness         6.8               5.9           5.1           4.9           2.2           4.7
Validity           6.0               6.0           3.5           4.3           2.0           3.7
Authorization      4.6               5.4           2.9           3.1           1.7           3.0
Accuracy           7.8               7.9           4.0           6.5           2.9           6.1
Correctness        7.7               7.7           2.9           6.2           1.8           4.5

Security           4.6               4.7           3.7           3.3           3.4           3.3
Availability       6.3               3.8           3.0           3.5           3.4           3.9
Accessibility      5.7               3.9           3.6           3.5           3.3           4.1
Granularity        3.5               3.3           1.6           2.8           1.6           3.6
Aggregation        3.1               2.9           1.6           2.6           1.6           3.8
Understandability  3.8               3.7           2.4           3.2           1.2           4.0
Consistency        5.4               5.5           2.6           3.7           1.3           3.9
Standards          4.7               4.7           2.0           3.2           1.9           3.0
Dependability      5.0               4.5           3.3           4.1           2.5           4.1
Predictability     4.4               3.4           2.5           4.0           1.5           3.1
Verifiability      5.6               5.3           3.0           4.1           2.9           4.9
Auditability       5.8               5.3           3.3           4.8           3.0           4.4
Credibility        4.3               3.7           2.0           2.6           2.1           2.9

The storage and transmission phases are least associated with impairments.

Table 6b summarizes the differences noted between the pairs of stages of processing. An “X” in a cell indicates a significant difference (p<.05) between the means of the stages of processing for the attribute or enabler listed in the left-hand column for the pairs in the respective column.
Table 6b

Pairs of stages of processing (columns): Input/Transmission, Input/Process, Input/Storage, Input/Output, Transmission/Storage, Process/Transmission, Output/Transmission, Process/Storage, Process/Output, Output/Storage.
Attributes/enablers (rows): Completeness, Currency, Timeliness, Validity, Authorization, Accuracy, Correctness, Security, Availability, Accessibility, Granularity, Aggregation, Understandability, Consistency, Standards, Dependability, Predictability, Verifiability, Auditability, Credibility.
[The positions of the X marks identifying the significant pairwise differences are not recoverable from the flattened layout.]

These findings suggest (consistent with the literature) that the input phase is a particularly significant source of impairments in core attributes and enablers.

8. Information Integrity Impairments by SDLC

Participants were asked to relate the impairments to stages of system acquisition/development, on a scale from 0 to 10, where 0 is no relationship and 10 is an absolutely powerful relationship. Table 7a summarizes these ratings.
Table 7a

Phases of the SDLC (columns):
Initiate: initial proposal, investigation, funding approval and planning.
Design: analysis of business function and user interface requirements; initial conceptual design.
Build: detailed design; acquisition/development (including unit and system testing); implementation, deployment, acceptance testing, conversion.
Operate: operation; monitoring, checkpoints, feedback loops.
Maintain: maintenance and change management; learning and improvement, abandonment or destruction.

Type of impairment Overall (N=28)  Initiate  Design    Build     Operate   Maintain
Completeness       7.4             3.0       4.6       4.8       4.6       5.0
Currency           6.5             1.7       2.4       2.8       3.7       3.2
Timeliness         6.8             1.6       2.6       2.8       4.1       3.5
Validity           6.0             2.5       3.4       3.9       3.8       3.9
Authorization      4.6             2.5       3.4       3.8       3.5       3.6
Accuracy           7.8             3.2       5.7       5.7       5.5       5.7
Correctness        7.7             2.9       5.8       6.5       6.0       5.5

Security           4.6             1.9       3.8       3.2       3.8       3.5
Availability       6.3             1.5       2.6       2.1       4.6       3.3
Accessibility      5.7             1.3       2.9       2.2       3.4       3.1
Granularity        3.5             1.4       3.7       2.4       3.0       2.3
Aggregation        3.1             1.1       2.9       2.4       2.9       2.4
Understandability  3.8             1.3       2.6       2.0       2.4       2.4
Consistency        5.4             1.9       2.5       3.1       4.2       3.5
Standards          4.7             2.4       3.4       3.6       4.5       3.7
Dependability      5.0             1.9       2.5       3.0       4.3       3.0
Predictability     4.4             1.2       2.2       2.6       3.4       2.3
Verifiability      5.6             1.8       3.2       3.6       3.9       3.5
Auditability       5.8             2.3       3.4       3.4       4.6       3.6
Credibility        4.3             1.8       2.5       2.8       3.9       3.2

Table 7b summarizes the differences noted between the pairs of stages of the SDLC. An “X” in a cell indicates a significant difference (p<.05) between the means of the SDLC stages for the attribute or enabler listed in the left-hand column for the pairs in the respective column.
Table 7b

Pairs of steps in the SDLC (columns): Design/Initiate, Build/Initiate, Operate/Initiate, Maintain/Initiate, Build/Maintain, Build/Design, Design/Operate, Maintain/Design, Build/Operate, Maintain/Operate.
Attributes/enablers (rows): Completeness, Currency, Timeliness, Validity, Authorization, Accuracy, Correctness, Security, Availability, Accessibility, Granularity, Aggregation, Understandability, Consistency, Standards, Dependability, Predictability, Verifiability, Auditability, Credibility.
[The positions of the X marks identifying the significant pairwise differences are not recoverable from the flattened layout.]

These findings suggest that the initiation phase is quite different from the other phases of the SDLC; i.e., it is least likely to be associated with severe impairments in core attributes or enablers. The operate phase is a significant source of impairments associated with enablers. Interestingly, the maintenance phase does not appear to be an unusually important source of severe impairments.

9. Information Integrity Impairments by System Component

Participants were asked to relate the impairments to system components, on a scale from 0 to 10, where 0 is no relationship and 10 is an absolutely powerful relationship. Table 8a summarizes these ratings.
Table 8a

Type of impairment Overall (N=28)  IT Infrastructure  Software  Human Infrastructure  Procedures  Data
Completeness       7.4             3.4                4.5       4.8                   6.0         5.7
Currency           6.5             2.3                2.5       3.5                   3.5         3.3
Timeliness         6.8             2.8                2.7       4.1                   3.5         2.9
Validity           6.0             1.8                3.4       2.9                   3.5         4.4
Authorization      4.6             2.2                2.7       3.4                   3.9         2.0
Accuracy           7.8             2.7                5.1       5.6                   5.1         5.5
Correctness        7.7             3.0                6.0       4.8                   4.8         4.5

Security           4.6             3.0                3.8       2.6                   3.2         3.3
Availability       6.3             3.8                4.0       2.5                   2.1         2.8
Accessibility      5.7             3.0                3.3       2.0                   2.5         2.0
Granularity        3.5             0.7                2.4       2.1                   2.6         2.4
Aggregation        3.1             1.4                2.5       1.5                   2.1         2.5
Understandability  3.8             0.9                2.9       3.8                   5.6         2.6
Consistency        5.4             1.8                3.8       3.5                   2.8         2.3
Standards          4.7             3.0                3.9       2.9                   4.3         3.8
Dependability      5.0             2.6                3.3       3.8                   3.0         3.8
Predictability     4.4             1.5                2.6       2.8                   2.3         2.5
Verifiability      5.6             1.5                3.6       3.0                   3.3         3.2
Auditability       5.8             3.2                4.1       3.4                   3.9         3.7
Credibility        4.3             1.6                2.0       2.9                   2.7         2.4

Table 8b summarizes the differences noted between the pairs of system components. An “X” in a cell indicates a significant difference (p<.05) between the means of the system components for the attribute or enabler listed in the left-hand column for the pairs in the respective column.
Table 8b

Pairs of system components (columns): Software/IT Infrastructure, Human Infrastructure/IT Infrastructure, Procedures/IT Infrastructure, Human Infrastructure/Software, Data/IT Infrastructure, Procedures/Human Infrastructure, Data/Human Infrastructure, Procedures/Software, Data/Software, Procedures/Data.
Attributes/enablers (rows): Completeness, Currency, Timeliness, Validity, Authorization, Accuracy, Correctness, Security, Availability, Accessibility, Granularity, Aggregation, Understandability, Consistency, Standards, Dependability, Predictability, Verifiability, Auditability, Credibility.
[The positions of the X marks identifying the significant pairwise differences are not recoverable from the flattened layout.]

These findings suggest that IT infrastructure is quite different from human infrastructure, software, procedures and data; i.e., it is least likely to be associated with severe impairments in core attributes or enablers. Understandability is an important differentiator between most components.

Limitations of this Study and Concluding Remarks

A study such as this one has a number of limitations which should be considered when interpreting the results. The participants were self-selected and may not be representative of the practitioner community. Hence, their ratings of attribute/enabler importance and observed information integrity impairments may not be generalizable. Also, the participants may have been biased towards endorsing the concepts included in the pre-readings because of the research support provided by ISACA for this project. In addition, the participants interacted with the author of the pre-reading materials during the workshop and may have been inclined towards being supportive rather than critical or challenging. On the other hand, the participants were experienced and qualified in IS assurance-related considerations. They had no incentives to support ideas that they did not agree with and were encouraged to be critical in the workshop sessions.
With these qualifications, the results provide a number of interesting perspectives on information integrity as summarized below.

This study posits that information integrity is synonymous with its representational faithfulness. Representational faithfulness is exhibited when information is: complete (within limits established by agreement, policy or regulation), current/timely (within limits established by agreement, policy or regulation), authorized/valid (in accordance with policies, standards and “business rules” established by top management and the Board and applicable laws and regulations established by regulatory agencies or legislative bodies) and accurate/correct (within limits established by agreement, policy or regulation). In addition, the study finds support for a second layer of attributes represented by the following “enablers” for the core attributes of information integrity:

Secure
Available/Accessible
Understandable/Appropriate level of granularity/aggregation
Consistent/Comparable/Standards
Dependable/Predictable
Verifiable/Auditable
Credible/Assured

This definition is broader than that provided in COBIT (ISACA, 2000), which is a widely recognized international control guideline, but narrower than the concepts of information quality discussed in the literature. Also, the definition of information integrity given here is a broader concept than data integrity, since data is commonly considered to be a “raw material” that is used to create an information product that is a “finished product” ready for use by an internal user such as an employee or manager or an external user such as a customer, supplier, analyst or regulator. Thus, one would expect a discussion of the attributes of information integrity to be somewhat broader than a discussion of data integrity and to consider the users of the information products. Information integrity is related to, but not guaranteed by, system integrity.
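As an illustration only, the four core attributes summarized above can be read as checks applied to a single information item. The sketch below is not part of the survey, the framework or COBIT. The `CreditLimit` record, its fields, the `integrity_report` function, the 30-day currency window and the zero tolerance are all hypothetical; they stand in for the limits "established by agreement, policy or regulation" and for the kind of business rule (such as a customer credit limit) discussed earlier.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional, Set


@dataclass
class CreditLimit:
    """Hypothetical record of a customer credit limit (a business rule)."""
    customer_id: Optional[str]
    amount: Optional[float]
    approved_by: Optional[str]  # role that authorized the limit
    as_of: date                 # date the limit was last confirmed


def integrity_report(record: CreditLimit, source_amount: float,
                     approvers: Set[str], today: date,
                     max_age_days: int = 30, tolerance: float = 0.0) -> dict:
    """Return a pass/fail flag for each core attribute.

    The thresholds are assumptions chosen for illustration only.
    """
    # Complete: all required fields are present.
    complete = record.customer_id is not None and record.amount is not None
    # Current/timely: confirmed within the agreed window.
    current = (today - record.as_of) <= timedelta(days=max_age_days)
    # Authorized/valid: approved by someone entitled to set the limit.
    authorized = record.approved_by in approvers
    # Accurate/correct: agrees with the source document within tolerance.
    accurate = complete and abs(record.amount - source_amount) <= tolerance
    return {"complete": complete, "current": current,
            "authorized": authorized, "accurate": accurate}


record = CreditLimit("C-001", 5000.0, "credit_manager", date(2003, 6, 1))
report = integrity_report(record, source_amount=5000.0,
                          approvers={"credit_manager"},
                          today=date(2003, 6, 15))
print(report)
```

Keeping one flag per attribute, rather than a single pass/fail result, mirrors the framework's treatment of the attributes as distinct: a record can be complete and accurate yet fail on currency or authorization.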
The survey results support the broader definition of information integrity compared with the one in COBIT. The survey results provide strong support for both the core attributes and the enablers. For example, currency, timeliness, authorization and security are not included in the COBIT definition of information integrity, although these concepts are included in other COBIT information criteria. Also, several enablers are not explicitly considered by COBIT in connection with information integrity criteria. COBIT is a global standard that is intended for widespread use for internal and external assurance on information technology controls. However, one of the policy recommendations arising from the findings of this study is that the COBIT definition of information integrity be reconsidered. Also, a two-layer framework of core attributes and enablers should be considered.

Interestingly, data stream and industry were not associated with significant differences in respondents’ observed impairment severity. However, phases of transaction processing, stages of system acquisition and development and system components were associated with impairments in significant ways. These findings suggest that measures aimed at improving information integrity must differentiate amongst these factors. For example, controls should consider the high risk associated with input, the moderate risk associated with transmission, processing and output and the low risk associated with the storage phase of transaction processing. Similarly, control strategies should consider the moderate risk of most stages of the SDLC and the low risk of the initiation phase. Finally, control strategies should consider the moderate risks associated with most of the system components and the low risk associated with IT infrastructure.

References

Agmon, Nachman and Niv Ahituv, “Assessing Data Reliability in an Information System,” Journal of Management Information Systems, Vol. 4, No. 2, Fall, 1987, pp. 34-44.
Betts, M., “Dirty Data: Inaccurate data can ruin supply chains,” Computerworld, December, 2001, p. 42.
Bolour, A., T.L. Anderson, L.J. Dekeyser and H.K.T. Wong, “The Role of Time in Information Processing: A Survey,” SIGMOD Record, Vol. 12, No. 3, 1982, pp. 27-50.
CICA (Canadian Institute of Chartered Accountants), 20 Questions Directors Should Ask About IT, 2002.
Davis, Fred D., “Perceived Usefulness, Perceived Ease of Use, and User Acceptance of Information Technology,” MIS Quarterly, 1989, pp. 318-336.
English, Larry P., “Improving Data Warehouse and Business Information Quality: Methods for Reducing Costs and Increasing Profits,” John Wiley & Sons, Inc., 1999, p. 147.
FASB (Financial Accounting Standards Board), Statement of Financial Accounting Concepts No. 2: Qualitative Characteristics of Accounting Information, May 1980.
Halloran, Dennis, Susan Manchester, John Moriarty, Robert Riley, James Rohrman and Thomas Skramstad, “Systems Development Quality Control,” MIS Quarterly, 1978, pp. 1-13.
Hamaker, S., “Spotlight on Governance,” Information Systems Control Journal, Vol. 1, 2003, pp. 15-19.
ISACA (Information Systems Audit and Control Association), COBIT (Control Objectives for Information Technology), 3rd edition, 2000.
ITGI (IT Governance Institute), IT Governance Executive Summary; Board Briefing on IT Governance, 2001.
Kahn, Beverly K., Diane M. Strong, and Richard Y. Wang, “Information Quality Benchmarks: Product and Service Performance,” Communications of the ACM, Vol. 45, No. 4, 2002, pp. 184-192.
Kaplan, R. and D. Norton, The Balanced Scorecard: Translating Strategy into Action, Harvard Business School Press, Boston, 1996.
Nayar, Madhavan K., “A framework for achieving Information Integrity,” IS Audit & Control Journal, Vol. 2, 1996, pp. 30-34.
O'Reilly III, Charles A., “Variations in Decision Makers’ Use of Information Sources: The Impact of Quality and Accessibility of Information,” Academy of Management Journal, Vol. 25, No. 4, 1982, pp. 756-771.
PricewaterhouseCoopers, Global Data Management Survey, 2001.
Redman, Thomas C., “Improve Data Quality for Competitive Advantage,” Sloan Management Review, 1995, pp. 99-107.
Redman, T., “The impact of poor data quality on the typical enterprise,” Communications of the ACM, Vol. 41, No. 2, February, 1998, pp. 79-82.
Richters, John S., and Charles A. Dvorak, “A Framework for Defining the Quality of Communications Services,” IEEE Communications Magazine, 1988, pp. 17-23.
Saull, R., “The IT Balanced Scorecard – A Roadmap to Effective Governance of a Shared Services IT Organization,” Information Systems Control Journal, Vol. 2, 2000, pp. 31-38.
Wang, R. Y., V.C. Storey and C.P. Firth, “A Framework for Analysis of Data Quality Research,” IEEE Transactions on Knowledge and Data Engineering, Vol. 7, No. 4, August, 1995, pp. 623-640.
Wang, R. Y. and D. M. Strong, “Beyond Accuracy: What Data Quality Means to Data Consumers,” Journal of Management Information Systems, Vol. 12, No. 4, Spring, 1996, pp. 5-34.
Wang, Richard Y., M.P. Reddy and Amar Gupta, “An object-oriented Implementation of Quality Data Products,” Workshop on Information Technologies & Systems, 1993, pp. 48-56.
Ward, Keith R. and John M. Ward, “Financial Information System: Asset or Liability,” Management Accounting, January, 1988, pp. 32-37.
Webster’s Third New International Dictionary.
Winter, Wolfgang and Ludwig Huber, “Part 3: Ensuring Data Integrity in Electronic Records,” BioPharm, 2000, pp. 24-27, 38.