White Paper - Excel Risks and Controls
Chris Mishler, CMA, CIA, CISA
Subject Matter Expert, User-Developed Application Risks
Experis Finance
November, 2014
Abstract
Excel is a powerful and popular business application that can be even more effective when the risks that come along for the ride, and their corresponding controls, are taken into account. This white paper enumerates various risks and some methods to control them, so that users can create necessary solutions that are less likely to cause material adverse impacts on their organizations. With some basic education on design techniques and built-in Excel features, it is possible to create excellent results without increasing the probability of serious errors. It helps enormously to know better practices for both the efficiency and effectiveness of these “user-developed applications” (UDAs), meaning those created without the benefit of formal IT application development. That process, typically referred to as the Software Development Lifecycle (SDLC), includes gathering user and functional requirements, testing, review and approval, and typically involves development management tools.
Risk
A common definition of risk is the probability of an adverse effect times the magnitude of the impact.
Although some enormous numbers flow through many financial reporting supply chain spreadsheets,
the risk of material errors could be more dependent on the sheer number of risky practices engaged in
by Excel solution developers and their users. It would not be uncommon to find large workbooks with
multiple worksheets and even millions of occupied cells. The visibility and importance of key numbers in revenues, profits, assets and liabilities increase the likelihood that they will draw the attention of reviewers. More difficult to discern are the subtle or low-key but pervasive behaviors that can produce adverse outcomes; these may play the bigger role in raising the risk of user-developed applications.
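As a purely illustrative calculation of that definition (the figures are hypothetical): an error with a 5 percent chance of occurring in a schedule that feeds a $2,000,000 accrual carries an expected impact of 0.05 × $2,000,000 = $100,000, which in many organizations is large enough to justify preventive controls.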
A common scenario of uncontrolled risk involves large and complex financial workbooks. Where does
one begin to assess the risk embedded in these files? Since prevention is an efficient way of reducing
risk, the design of the workbook, in terms of its data flow and the segregation of like data types (modular design), is a good overall starting point. A helpful maxim in risk reduction is to make the
solution “as complex as necessary and as simple as possible.” Another rule of thumb in Excel solution
design is to emulate a database as much as possible in the data and calculation areas. That is, each
column would be a field and each row a record. The implication of this approach is to use each column
for one purpose or data type. All too often, this potential guideline is unknown or underemployed,
consequently raising risk levels.
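As a minimal sketch of this database-style layout (the field names and values are hypothetical), an input area might look like:

Date        Entity   Account        Amount
2014-10-31  East     4000 Revenue   125,000.00
2014-10-31  West     4000 Revenue   98,500.00

Each column holds a single data type, each row is a complete record, and a summary formula such as =SUMIFS(D:D,B:B,"East") can then reference the whole Amount column without exceptions.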
Those of us who are combination developers and users may have been “home-schooled” or self-taught in Excel functions, but formal or informal education on the risks of UDAs is difficult to find. No surprise, then, that many of us engage in “organic programming,” so to speak, in which the developer starts with a data set and an end in mind, or possibly only partial requirements for a desired report or outcome, and begins putting together formulas over data that has been arranged somehow to lead to the kind of final reporting needed.
Controls and Good Practices
There are at least as many control techniques as there are risk types. The question is which controls best
fit the given situation. The basic considerations are when to use preventive and detective controls. Since
it is well-known that preventive measures are more efficient than detective ones, focusing on these first
will yield the best results. The types of preventive controls to consider include planning, design, testing
and even review by other experts. The border between preventive and detective controls may be fuzzy
until an application (workbook) is launched into production, the point at which a workbook is used to
create the intended outcomes in the financial, analytical, regulatory or operational arenas. While user-developed applications are by definition not run through the typical SDLC, some of these practices can
be simulated to improve the quality of the production outcome.
Pre-Production
There are many benefits to imposing the discipline of planning the creation of a workbook solution
before starting to put together the requisite components. It may be as simple as putting pencil to paper
to sketch out the elements for a new process or redesigning an existing one. Gather all the user and
functional requirements along with all documentation on the process to be reflected in the Excel
application. Interviews with the ultimate users, if not oneself, will be of great benefit to the
understanding of the need. While it is possible to encounter the “paralysis by analysis” bottleneck,
anecdotally this situation is unlikely, due to the deadlines typically bearing on accountants, developers
or other users.
Modular Design
The other key aspect in planning a spreadsheet solution is its design. In contrast with the “go-with-the-flow” programming philosophy, a preference for the modular design approach pays dividends during implementation and during continued use: it reduces risk through improved data flow and integrity, eases review and approval of calculations, sets up reports that do not have to change when inputs change, makes assumptions explicit, and supports enhanced documentation and stronger controls. The concept is easily grasped from a diagram of the data flow among the sections described below.
The idea we need to employ is partly a kind of control itself – segregation of duties, achieved by
grouping like data types and functions in separate worksheets (pages or tabs). In smaller applications,
these can be on the same tab, but then formatting through the use of colors and borders is needed to
delineate these separate data types. The saying, “Everything in its place and a place for everything,”
takes on special impact in good spreadsheet design.
The typical data types (counting calculations as data):

• Data inputs (typically periodically updated values used in calculations)
• Assumptions (infrequently updated universal values used throughout the application)
• Calculations or formulas, linked to input data
• Documentation such as standard work instructions, input sources, and an overview of the process
• Controls or validation checks
• Outputs (reports, journal entries, etc.)
There are other options, such as navigational aids like a table of contents, or tabs that are section
headers, but these become more necessary or desirable as an application increases in size and
complexity.
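One possible arrangement of tabs that reflects this modular grouping (the tab names are only an illustration, not a prescription) is:

Documentation | Assumptions | Inputs | Calculations | Controls | Outputs

with the data flowing left to right: inputs and assumptions feed the calculation tab, the control tab tests the calculations, and the output tab reports from them.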
Color. Another option to enhance spreadsheets is the appropriate use of color to draw the eye to the various distinct purposes of each area in a file. Conditional formatting often involves color as well and can be quite beneficial in highlighting exceptions (a brief example follows the color list below). In medium to large files it is typical to color the tabs rather than the actual cell contents. Good rules of thumb (but consistency is key):

• GREEN – Ideal for input values. Green for “go” here to enter the process inputs. Typically these ranges are unlocked.
• YELLOW – Fits the need for values or other input which changes infrequently but is still key, such as interest rates or periods (dates). Signifies caution: check whether the values need to be updated. These cells or named ranges may be unlocked for convenience, unless they rarely change.
• RED – Use for calculation sections (implies “stop”: do not change without review and approval). Lock these cells and turn on protection for the worksheet after the formulas have been tested, reviewed and approved, to prevent unintended or other kinds of unapproved changes.
• BLUE – Typical for identifying linked cells.
• Other colors are useful for other purposes, such as identifying documentation or error conditions, but the main point is to be consistent and explain the use of colors within the documentation, if not obvious. Another hint – moderation in all things (don’t overdo it on colors, as they can also distract users).
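To make the conditional-formatting point concrete (the cell reference is hypothetical): a rule created with “Use a formula to determine which cells to format” and the formula

=$B$5=FALSE

applied to an overall control-status cell will fill that cell red whenever any underlying check fails, so an exception is difficult to miss.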
While the benefits were touched upon above, additional pluses for this approach include:

• Efficiency – Data is identified and entered once and used multiple times. This approach reduces re-keying, thereby reducing the possibility of entry errors. It is also easier to debug or trace errors if data is located in one place.
• Consistent quality of production – Calculations located in the same place allow easier checking and consistency of the formulas. Inconsistent formulas are a constant threat of error.
• Control – It is possible and even common for users to introduce checksums, input controls and similar tests throughout a large spreadsheet. It is better to move these controls to, or create them in, a single tab, including an overall control status cell which verifies the condition of all the controls as TRUE or FALSE, ideally with conditional formatting to highlight an undesirable control status for the overall workbook application (a brief sketch follows this list). As a bonus feature, a picture of the overall status linked to wherever the majority of the spreadsheet’s work is being done can be a real plus.
• Ease of navigation – The consistent presence of typical model elements and the order of their appearance make it easier for users to follow the data flow and apply the process steps. Users save time otherwise wasted poking around to find, for example, the input areas, whose location may seem arbitrary to non-developers.
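A minimal sketch of such a control tab (the sheet names, ranges and cell addresses are hypothetical):

In cell B2: =ROUND(SUM(Inputs!D2:D500)-Outputs!B10,2)=0 (checksum: the report total ties back to the input detail)
In cell B3: =COUNTBLANK(Inputs!A2:A500)=0 (completeness: no blank keys in the input records)
In cell B5: =AND(B2:B3) (the overall control status, TRUE only when every check passes)

Conditional formatting on B5, and a linked picture of it on the main working tab, then gives users a single place to see whether the workbook is in a trustworthy state.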
Review the File
As obvious as it seems, bad spreadsheets can be caught red-handed and good spreadsheets can become
better through a careful review by the developer. Once the major requirements, functions and
processes have been compared to specifications (assuming they exist), ask an Excel expert to check it as
well. In larger applications, this kind of review is impractical in any mode other than random-like
sampling or spot-checking. In these cases, an investment in a spreadsheet diagnostic tool (see Appendix
B) will be a wise one, worth the money. For the sheer heavy lifting and peace of mind they provide, the
price will seem quite reasonable. There is a science to the use of these programs, since they tend to
produce numerous false positives in certain tests (such as the search for “constants in formulas,” which sees the lookup column reference in a VLOOKUP function as a potential risk). Some customizability of tests and built-in filtering of results can overcome these situations.
Checking the logical flow of a file is enabled by having documentation to use as a source for how the file
is supposed to work. This kind of review is somewhat pointless without the baseline expectation of the
various subprocesses represented by formula regions. The mathematical integrity of the various
formulas is an easier chore. The trick is to substitute values for data which would produce outcomes
easily verifiable against expected computations. Switch out existing numbers for zeroes or ones.
Summary formulas should be easy to check. Unexpected results could signal inconsistent formulas or
overriding of formulas with hardcoded values.
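One simple way to apply this test (the range is hypothetical): temporarily replace every value in an input column such as Inputs!C2:C101 with 1; a summary cell that is supposed to total that column should then equal =ROWS(Inputs!C2:C101), that is, 100. Any other result points to an inconsistent formula, a truncated range, or a hardcoded override. Work on a copy of the file, or undo the substitution, before returning the workbook to use.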
Excel does provide a certain level of error checking, so it is a good quick check or complement to the
automated diagnostic software. Hints about underlying issues are strewn about many workbooks in the
form of little green triangles in the corner of a cell, with an exclamation mark call-out signaling some
internal conflict with the rules. Instead of ignoring them, see if there is a good reason for the alert. If the
answer is, “Yes, I know, that is okay,” then take the time to highlight the range in question and select “Ignore Error” from the smart icon (the callout) so that the alerts go away. This habit of confronting the error alerts counters the tendency to ignore warnings that sometimes might actually be helpful.
The other resource embedded in Excel is the Error Checking command in the Formula Auditing group of the Formulas ribbon.
Database Mentality
Making an Excel application look and act like a database (such as in Microsoft Access) has additional
benefits in risk mitigation. Some will object that Excel is not a database and thinking so will cause
misalignment of the program’s strengths and outcomes. Perhaps so, but within limitations, knowing
some database principles and applying them as much as reasonably possible can have a positive impact
on risk levels. In general, and especially for data inputs and calculations, think of columns as fields and
rows as records. Some of these database preferences or features and their Excel implications:
1. Avoid blanks. Like Nature, databases abhor vacuums. One reason users build in numerous blank
cell references, for example, is formatting output in a report. In turn, this perceived need stems
from missing the modular design clue bus. We like a certain amount of white space around key
output values, which is understandable, but if we adhere to the modular design concept, we can
add all the white space that delights the eye without introducing risk by mixing data types and
functions such as input, calculations and outputs.
2. Data type consistency. Anyone who has tried to upload an Excel page into Access will attest to the fact that mixing data types or even formats will lead to rejection of records in an upload. There is a good reason for this: functions relying on given fields expect a consistent data type. A number field should not at the same time be formatted as text, and vice versa. Dates should be in date format all the way down a column, and so forth. This database principle reinforces the modular design, and the reverse is also true. (A simple check formula appears after this list.)
3. Process-flow enhancement. Databases consist primarily of data tables, queries to select or act
on certain data sets and corresponding reports. The clear separation of database functions
reinforces the modular and step-wise design of a process and an application containing a
collection of processes. For example, data is uploaded, an update query is run to bring a table up
to the latest version, and another query or report is run to produce the intended output.
Thinking of Excel workbook solutions as processes with unwavering steps to add inputs, run
calculations and produce outputs aligns well with the database mentality. Unintended variation
is the enemy of both data quality and accurate outcomes.
Practices to Avoid
We have touched on the better practices in Excel spreadsheet use and development. One may ask what
design practices should be avoided in order to improve control and increase efficiency and effectiveness
of spreadsheet applications. A sample follows:

• “Bloatware” – In one sense, the definition of bloatware might be extended to include not just unwanted computer-clogging programs, but also unnecessary regions, pages, historical or ad hoc sections of a spreadsheet. Like aging humans, sometimes workbooks spread out in unhealthy ways with accumulated one-off or temporary analyses and supporting data that is irrelevant to the stated purpose of a critical application in Excel. Examples also include excessive periodic data, side calculations done for a one-time check, abandoned subprocesses, and “FYI” items. This condition tends to be found more in older files. One of the first risk indicators for a critical file will be its age, given this tendency to be packrats of data or analyses, which may be important enough to save, but would be better off in a separate file, appropriately named and documented.
• Opacity – One antonym to the desirable attribute of transparency is the usually unplanned behavior of hiding or not showing work that goes into a final output. This unhealthy tendency manifests itself in hidden ranges (columns, rows and tabs), but also in a subtler over-complication of functions and a lack of documentation. Formulas that are overly complex are almost guaranteed to make all but the most determined critical eyes glaze over and pass on to something else. Skipping explanation of the input sources and how they are used, including any special functions (by special, consider those more sophisticated than normal math), tends to increase the odds of user misunderstanding and therefore misuse of the application. (A side benefit of properly documenting workbooks is reducing audit costs, since auditors can follow what is going on with fewer questions for accounting staff. Employee transition upset is also reduced by good documentation.)
• Illogical flow – Similar to stream-of-consciousness or organic programming, a map of sheet connections might show a drunken spider web of links, with some worksheets serving as both precedents and dependents of each other. This circularity or seeming arbitrariness increases the complexity of the application and thereby makes it more difficult to discern its logic. It also may slow calculation or processing speed in larger files. For an antidote, recall the planning phase and, for more complex processes, use a process flow diagramming tool such as Microsoft Visio to fully understand what is supposed to be happening in the workbook.
• Overuse of volatile functions – Common functions such as NOW() and ROW() can be handy, but the abuse of these expressions will slow performance. Array functions are also double-edged swords: very useful for auditors in particular, who are looking for an alternate method to recalculate, but also resource hogs.
• Unlabeled and ad hoc calculations – Occasionally, one needs to do a quick double-check or reasonableness test on some data. The temptation to slide over to an open area on the source tab and pop in some formulas is very real, but it is antithetical to good spreadsheet hygiene once the temporary section has served its brief purpose. Sometimes the purpose of the behavior is an ongoing control of some variable section of data; if the calculation is actually useful or could be made into an ongoing control, consider institutionalizing it in a dedicated control tab. Resist the added insult to injury that comes from simply hiding these stray areas.
• Speaking of labeling, do not leave regions of data or formulas unlabeled – There is an underlying assumption that “everyone” will know what the data or formulas represent. Refer back to the “database mentality” section above, in which treating columns like fields in a database would prevent the lack of labeling, since fields cannot even be set up without a name for the data. By the way, a preferred way to create a label that contains multiple words is to put them all in one cell at the top of the section and then apply “wrap text” formatting. All too often users will enter longer labels across multiple rows; doing so prevents the best use of filters and increases the number of header rows.
• Drawing a blank – A number of unhealthy spreadsheet habits can be traced back to the lack of modular design. Lacking appreciation for this technique can lead to formatting data- or calculation-heavy areas to also serve as outputs or reports by inserting blank rows or columns. Again, the effects include spreading out the active section of the worksheet, but more ominously, the blank rows are often referenced in formulas, significantly increasing the risk of inadvertent entry. Typically, this behavior reflects the need to add new records in an ensuing accounting period, for example, but there are better ways to make this entry easy without increasing risk, such as a simple data validation rule on the blank row (the entry must equal a whole number of 0). This placeholder marks the row where new rows can be added that will be included in the summary formulas (a brief sketch follows this list).
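A brief sketch of that placeholder idea (the row numbers are hypothetical): keep row 51 as an always-zero placeholder by giving it a Data Validation rule of “Whole number, equal to, 0,” and write the summary as =SUM(B2:B51). New records inserted just above row 51 then fall inside the summed range automatically, and no one has to edit the formula.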
Conclusion
The purpose of this paper has been to share some information for spreadsheet developers and users to
attain higher levels of excellence by reducing the risks inherent to user-developed applications. The time
is now for boosting our knowledge base of risky Excel behaviors and their antidotes. Like international
accounting standards, good Excel practices are more principles- than rules-based, which makes them easier to learn and more flexible to adapt to the myriad use cases encountered in the four main spreadsheet domains of financial, analytical, operational and regulatory applications. The key to safer computing in
Excel is knowing more and using more of the risk-reducing principles. Being at the decision-making table
as professionals means taking proper safeguards even while learning and applying the coolest tools
available. Make every risk you take a calculated one.
Appendix A Spreadsheet Standards
FAST http://www.fast-standard.org/
“Models should be Flexible, Appropriate, Structured and Transparent”
SSRB
http://www.ssrb.org/
“Custodians of the Best Practice Spreadsheet Modelling Standards”
Appendix B Spreadsheet Diagnostic, Discovery & Management Tools
Incisive suite http://new.incisive.com/
Cimcon suite http://www.sarbox-solutions.com/main/index.asp
ClusterSeven http://www.clusterseven.com/spreadsheet-management/