GRCTC Working Paper
December 2014
SEMANTIC TECHNOLOGIES FOR REGULATORY INTELLIGENCE IN THE FINANCIAL INDUSTRY
Dr Tom Butler, Dr Elie Abi-Lahoud and Dr Angelina Espinoza Limon
© GRCTC
Abstract
The financial industry is undergoing major
regulatory changes in all jurisdictions
across the globe. The growth in the number
and complexity of regulations is causing
major problems for the industry. Extant
governance, risk and compliance (GRC) systems and traditional business intelligence
(BI) systems are not serving the needs of
the most systemically important sector for
the global economy. Increasingly, the industry is looking to semantic technologies
to understand complex regulations. This
paper presents the results of an ongoing
research initiative to develop semantically-enabled regulatory intelligence capabilities
to underpin regulatory compliance change
management in the domain of anti-money
laundering (AML). This ontology-based
system enables innovative semantic tagging and querying of complex regulatory
texts and associated rules which have been
extracted, transformed and loaded into an
RDF triple store. The design research methodology by which this is achieved is described, as is the progress in both the Design and Relevance Cycles of the design
science research approach being adopted.
The paper ends by describing future steps
and significance of this research for regulatory intelligence and regulatory compliance
change management initiatives.
Introduction
The financial crisis in 2008 had disastrous
consequences for the world economy
(Campbell, 2011). Regulators across the
globe responded rapidly to this situation by
instituting a raft of new regulations and
strengthening existing regulations (Grant
and Wilson, 2012). A recent commentary in
The Economist outlined the current challenges facing the financial industry: “Bankers had
hoped that, after seven years of penance for
their part in the financial crisis, the end of
wrenching overhauls forced by fierce new
regulations might be nigh. But to their dismay, the regulators’ zeal is undimmed. Far
from giving banks respite, they are toughening up old rules and devising new ones, perhaps heralding a new wave of restructuring.”1
Consider how the US banking reform Dodd-Frank
Act (2010) compares with previous legislation: it
stands at 2,319 pages, whereas the
Sarbanes-Oxley Act (2002) is just 66 pages.
Moreover, Dodd-Frank is being translated
into over 400 regulatory rules, which will fill
over 40,000 pages of regulatory text to be
published across various Titles of the
Code of Federal Regulations (CFR). Currently
only 200 regulatory rules are published and
the international financial industry is in
considerable confusion. Based on data
published by McLaughlin and Greene (2013),
Figure 1 illustrates the restrictions in Titles 12
and 17 of the CFR from 2010. In 2010, before
Dodd-Frank, Titles 12 and 17 of the CFR
combined had a total of 343 restrictions. By
the end of 2012, the number of restrictions
had increased to 3,172.
As large and onerous as Dodd-Frank is, similar
increases in the number and complexity of
regulatory rules are to be found across all
regulatory regimes, including the European
Union. Thus, industry experts warn of a
“looming train wreck in risk management,
regulatory compliance and reporting”
(Kendall, 2013). This poses significant
challenges for governance, risk and compliance
(GRC), both for organisations in the financial
industry and for GRC vendors.
1 http://www.economist.com/news/finance-and-economics/21620225-big-banks-prayers-halt-new-regulation-have-fallen-deaf-ears-no-respite : Accessed November 2014.
Figure 1. Financial Restrictions in the CFR 2010-2012 (Adapted from McLaughlin and Greene 2013)
There is a dearth of information systems (IS)
research on GRC for financial services; however, one study of note comes from Kenneth Bamberger (2010), who drew on IS perspectives in his analysis of the failures in GRC
practice and related information systems
leading up to the crisis. Bamberger (2010, p.
706) concluded that GRC-related IS failures
were due to “problems of translation…of
both legal mandates and business understandings of risk into computer code and
actionable controls.” This problem continues
in that Kendall (2013) reports that “most of
the largest banks understand that they not
only have inadequate capabilities to address
the regulations that have already been imposed, but that even more regulation is inevitable.” Our ongoing research supports her
conclusion on another looming train wreck
for the financial industry. Tangible evidence
for this state of affairs is found in the significant fines being imposed by regulators
across the industry for operational risk
events such as anti-money laundering (AML)
failures, LIBOR rigging, fraud and, more
recently, the Forex rigging scandal in the
UK.
The Governance Risk and Compliance Technology Centre (GRCTC) was founded to conduct R&D on the use of semantic technologies
for governance, risk and compliance in
the financial industry. In late 2012, a number
of Global Systemically Important Banks (G-SIBs) and a leading GRC vendor to the financial industry identified operational risk, and
its sub-domain of AML risk in particular, as
areas that required urgent research attention. They argued that existing business intelligence and knowledge management systems (KMS) were not addressing their concerns and they therefore sought an innovative solution based on semantic technologies
(cf. Declerck et al., 2007; Sheth, 2005),
which they argued overcame the limitations
of traditional IS for GRC. There is, however, a
nascent body of research on IS support for
requirements engineering in legal domains.
Here researchers have proposed concepts
and models of regulatory texts and have created conceptual models of laws and regulations (cf. Zeni et al., 2013). However, there is
little evidence that such models or underlying concepts have been subjected to evaluation (Design Cycle) or to field
testing (Relevance Cycle), although the work of
Zeni et al. (2013) on GaiusT looks
promising.
However, the field of regulatory science is
well developed in the life sciences, and in the
pharmaceutical industry in particular
(Hamburg, 2011). Semantic technologies
are being applied across the fields of medicine and pharmacology (Gomez-Perez et al.,
2013), and more recently to automate regulatory compliance in pharmaceutical manufacturing (Sesen et al., 2010). This has given rise to the concept of regulatory intelligence (Badreddin et al., 2013),
as traditional business intelligence tools
and techniques do not address the specific
challenges posed by the regulatory domain.
This paper presents the findings of our design science research initiative on the development of semantic technologies to enable the regulatory intelligence capabilities needed
to underpin regulatory compliance change
management in the financial industry. The
paper describes how regulatory ontologies
are being developed at the GRCTC to enable
regulatory texts to be queried in order to
help GRC executives answer questions such
as ‘What are the various restrictions in an
individual instrument of legislation or a
regulatory rule?’ Likewise, these semantic
technologies can query legislation and regulatory texts to identify obligations, derogations, exemptions, exclusions, and so on. They also enable “simpler” questions to be answered, based
on meta-data related to regulations such as
agency, enforcement type(s), dates, etc.
It is planned to develop regulatory intelligence systems based on this
research in order to informate the development of governance policies, risk management strategies and compliance reporting,
as well as a new generation of regulatory compliance knowledge management systems
(RKMS).
The remainder of this paper is structured
as follows. The following section describes
our design science research approach. The
third section describes our research in progress on a nascent regulatory
compliance change management system.
The final section describes on-going R&D
towards the completion of this project.
Design Science Research Approach
The motivation for and research object of
this study’s research in progress has been
described. In positioning our research we
look to Winter (2008, p. 471), who states
that design research (DR) is aimed at
“creating solutions to specific classes of
relevant problems by using a rigorous construction and evaluation process.” Winter
(ibid.) indicates that “design science reflects the design research process and aims
at creating standards for its rigour.” We
therefore classify our research-in-progress
project as Design Research (DR) as it concurs with Winter’s (2008) conception of
this type of research. The design artefacts
being produced in this study include: (a)
Constructs (i.e. concepts in an ontology);
(b) Relationships between, and axioms that
govern, these constructs; (c) Models (in
Web Ontology Language (OWL2) represented in Protégé); and (d) Methods (an
approach to the construction of concepts,
relationships, axioms and models). According to Hevner (2007) design science research should include: (a) a Design Cycle,
which involves the essential activities of
developing and evaluating the design artefacts and research processes; (b) a Rigor
Cycle, which connects the design cycle with
a knowledge base of scientific theories, experience & expertise, and meta-artefacts;
and (c) a Relevance Cycle, which incorporates interactions between the environment of the problem domain and the core
design activities (cf. Hevner et al., 2004).
Each of these cycles was incorporated into
our design science research. In the DR project described below, the Rigor Cycle was
underpinned by Design Science (DS) theory
based on Formalism (West, 2004), which
adhered to the Bunge-Wand-Weber (BWW)
Ontology (Wand and Weber, 1993, 1995,
2002), knowledge engineering principles,
and in particular the formalisms underpinning the application of the Web Ontology
Language (OWL2) published by the W3C.
We also align our DR with standards published by the Object Management Group
(OMG), particularly the Semantics of Business Vocabulary and Business Rules (SBVR)
standard and the OMG and Enterprise Data
Management (EDM) Council’s Financial Industry Business Ontology (FIBO) standard
(Bennett, 2011, 2013).
The Relevance Cycle in this project involves
regular feedback and demonstrations to
GRC executives and GRC application vendors, as well as progress reports to the
OMG’s Financial Domain Task Force, which
consists of members of the OMG Ontology
SIG, SBVR SIG and subject matter experts
from the financial industry globally. The
relevance of our theoretically informed DR
project is therefore ascertained. We now
outline our Design Cycle Activities.
Regulatory Intelligence and Regulatory Compliance Change Management Systems
Current solutions for regulatory intelligence and regulatory change management
rely on highly expensive, labour-intensive
analysis of legislative and regulatory text
by subject matter experts. Several aspects
of these time-consuming tasks could be
automated using appropriate semantic technologies. The objective of this design research project is to leverage semantic technologies to assist subject matter experts
(SMEs), be they lawyers, GRC officers or
banking executives, in making sense of the
wide and complex spectrum of legal documents, regulatory texts, and other rule-based sources in order to achieve better
regulatory change management, more effective governance policies, enhanced risk
management, and relevant compliance reporting. More precisely, we are combining
several semantically-informed techniques
in a system that provides the capability to
answer important but technically elusive questions such as:

• What are the compliance imperatives (obligations, prohibitions etc.)
  in a regulation or rule and where do
  they appear?
• How can semantic technologies support regulatory change management?
Our DR approach consists of creating and
populating the Financial Industry Regulatory Ontology (FIRO) in OWL2 consisting of
fundamental regulatory and domain concepts using a combination of text analytics
techniques and subject matter expertise.
OWL, or the Web Ontology Language, is the
schema language, or knowledge representation (KR) language, of the Semantic Web.
The resulting Knowledge Base is persisted
in a Resource Description Framework
(RDF) triple store. RDF is the data modeling
language of the Semantic Web; all Semantic Web information is stored and represented in RDF. The RDF triple store
can then be queried using SPARQL to answer questions such as the ones described
above. SPARQL, the SPARQL Protocol and
RDF Query Language, is the query language of
the Semantic Web. It is specifically designed to query data across various systems.
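The idea of storing statements as triples and querying them by pattern can be sketched in plain Python. This is only an illustration of how subject-predicate-object triples are matched against a query pattern; it is not a real triple store or SPARQL engine, and the `ex:` identifiers and the `match` helper are hypothetical names that do not come from FIRO.

```python
from typing import Iterator, Optional

Triple = tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical statements; the identifiers are illustrative only.
TRIPLES: set[Triple] = {
    ("ex:section15", "rdf:type", "ex:Obligation"),
    ("ex:section15", "ex:concerns", "ex:CustomerDueDiligence"),
    ("ex:section20", "rdf:type", "ex:Prohibition"),
}


def match(
    store: set[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples that fit the pattern; None is a wildcard, like a SPARQL variable."""
    for triple in store:
        if all(want is None or want == got for want, got in zip((s, p, o), triple)):
            yield triple


# Roughly equivalent to: SELECT ?x WHERE { ?x rdf:type ex:Obligation }
obligations = sorted(t[0] for t in match(TRIPLES, p="rdf:type", o="ex:Obligation"))
print(obligations)
```

A production system would delegate both storage and matching to a triple store behind a SPARQL endpoint; the sketch only shows why a uniform triple representation makes such pattern queries straightforward.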
Figure 2 illustrates the four phases of our
DR methodology. Here several innovative
techniques developed by the researchers
and related semantic technologies are combined for the purpose of developing the regulatory intelligence tools required for a working
prototype of an RCMS. The first is the ontology
engineering phase. Here SMEs create a regulatory vocabulary and reuse it to capture
the regulatory intent in a rulebook. The
output of this stage is then applied
by Knowledge Engineers, or Semantic
Technologies Experts (STEs) as we term
them, to create a family of formal ontologies. This phase is supported by the Semantics of Business Vocabulary and Business
Rules (SBVR), which is being applied using
an innovative methodology developed at
the GRCTC.
Figure 2. Phases of the Design Research on Enhanced BI for RCMS
The Financial Industry Regulatory Ontology
(FIRO) contains the following ontology
family members. FIRO-H contains high-level concepts in the regulatory domain,
such as prohibitions, obligations, derogations etc. These are aligned with the SBVR
standard grammar. FIRO-S is an ontology
that captures the semantics and structural
description of legislative and regulatory
texts according to the Akoma Ntoso standard. FIRO-AML captures the concepts,
semantics and axioms of
the Anti-Money Laundering domain. FIRO-RCM is an operational ontology that captures the entire semantics and axioms of
regulatory texts to guide the ontology population process.
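One way to picture how the family members fit together is as a layered record in which each field draws on a different ontology. The sketch below is an assumption-laden illustration in Python rather than OWL2: the class names, enum members and field names are hypothetical, and the sample instance mirrors the Section 15 example discussed later in the paper.

```python
from dataclasses import dataclass
from enum import Enum


class Modality(Enum):
    """High-level regulatory concepts, in the spirit of FIRO-H."""
    OBLIGATION = "Obligation"
    PROHIBITION = "Prohibition"
    DEROGATION = "Derogation"


class AmlConcept(Enum):
    """Domain concepts, in the spirit of FIRO-AML."""
    RECORD_KEEPING = "RecordKeeping"
    CUSTOMER_DUE_DILIGENCE = "CustomerDueDiligence"
    ONGOING_MONITORING = "OngoingMonitoring"


@dataclass(frozen=True)
class Location:
    """Structural position of a provision, in the spirit of FIRO-S."""
    document: str
    section: str


@dataclass(frozen=True)
class Annotation:
    """Operational record tying the layers together, in the spirit of FIRO-RCM."""
    location: Location
    modality: Modality
    concepts: frozenset[AmlConcept]


section_15 = Annotation(
    location=Location(document="UK Money Laundering Regulation 2007", section="15"),
    modality=Modality.OBLIGATION,
    concepts=frozenset(
        {
            AmlConcept.RECORD_KEEPING,
            AmlConcept.CUSTOMER_DUE_DILIGENCE,
            AmlConcept.ONGOING_MONITORING,
        }
    ),
)
print(section_15.modality.value, sorted(c.value for c in section_15.concepts))
```

In the actual ontologies these layers would be expressed as OWL2 classes and properties with axioms; the dataclasses only convey the separation of structure, modality and domain topic.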
In the second phase, legal SMEs use
FIRO-RCM concepts to manually annotate a set of
AML documents, which serve as training data for the
automatic classification algorithms. In the
third phase, several classification algorithms are executed in order to populate
the ontology or, in other words, to tag the regulatory text with concepts from the ontology. This places a semantic structure on such
texts that did not previously exist. The tagged
texts are then stored in an RDF triple store. The
final phase involves the design of an application that is used to query the content of
semantically tagged regulatory texts stored
in the knowledge base using a SPARQL endpoint.
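The second and third phases (train on annotated text, then tag new text) can be sketched with a small classifier. The paper does not name the classification algorithms used, so the multinomial naive Bayes below is a swapped-in stand-in chosen for brevity; the training sentences are invented for illustration and are not quoted from any regulation, and the function names are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Invented sentences standing in for SME-annotated training text.
TRAINING = [
    ("A relevant person must keep records of customer identity", "Obligation"),
    ("A firm must apply customer due diligence measures", "Obligation"),
    ("A person must not disclose that a report has been made", "Prohibition"),
    ("A firm must not open an anonymous account", "Prohibition"),
]


def tokenize(text: str) -> list[str]:
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count tokens per label and examples per label."""
    token_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        token_counts[label].update(tokenize(text))
    return token_counts, label_counts


def classify(text, token_counts, label_counts) -> str:
    """Return the label with the highest add-one-smoothed log-probability."""
    vocabulary = {token for counts in token_counts.values() for token in counts}
    total_examples = sum(label_counts.values())
    scores = {}
    for label, n_examples in label_counts.items():
        counts = token_counts[label]
        denominator = sum(counts.values()) + len(vocabulary)
        score = math.log(n_examples / total_examples)
        for token in tokenize(text):
            if token in vocabulary:  # tokens never seen in training are ignored
                score += math.log((counts[token] + 1) / denominator)
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAINING)
print(classify("The bank must not maintain anonymous accounts", *model))
```

The predicted label would then be written back as a tag on the provision, which is the step the text describes as populating the ontology.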
Regulatory Intelligence in Action
Having developed the demonstrator in the
first phase of the RCMS, we entered the
DSR Relevance Cycle, where the prototype
application was demonstrated to executives from the financial industry and technology sectors. To explain this, we first illustrate the target text, the 2007 UK Money Laundering Regulation. Figure 3
shows an excerpt, Section 15 (tagged in
FIRO-S), which has been tagged as an Obligation (FIRO-H) and which covers AML
concepts such as Record-keeping, Customer
Due Diligence and Ongoing Monitoring.
Figure 3. Example of a Structured Regulatory Document containing Unstructured Regulatory Data
Figure 4 illustrates the application query
interface. In this case several pre-structured queries are presented. These are
submitted to the SPARQL endpoint, which
then returns the result. The queries take
several parameters: for example, a query can
list the Obligations that relate to the AML concepts
of Customer Due Diligence, Monitoring,
Reporting, and so on. The location of the
Obligation is returned by default (e.g. Section, Sub-section, Page etc.). However, the
actual text, as in Section 15 above, may also
be presented.
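A pre-structured query of this kind can be thought of as a SPARQL template filled in from a few parameters. The sketch below builds such a query string in Python; the `firo:` namespace IRI and the property names `concerns`, `inSection` and `hasText` are assumptions made for illustration, since the paper does not publish the actual FIRO vocabulary, and `build_query` is a hypothetical helper.

```python
import re

# Hypothetical namespace; FIRO's actual IRIs are not given in the paper.
PREFIX = "PREFIX firo: <http://example.org/firo#>"
_LOCAL_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def build_query(modality: str, concept: str, include_text: bool = False) -> str:
    """Build a SPARQL SELECT listing provisions of one modality about one AML concept.

    The location (section) is always selected; the provision text only on request.
    """
    for name in (modality, concept):
        if not _LOCAL_NAME.fullmatch(name):
            raise ValueError(f"not a simple local name: {name!r}")
    variables = "?provision ?section" + (" ?text" if include_text else "")
    patterns = [
        f"?provision a firo:{modality} ;",
        f"    firo:concerns firo:{concept} ;",
        "    firo:inSection ?section .",
    ]
    if include_text:
        patterns.append("?provision firo:hasText ?text .")
    body = "\n  ".join(patterns)
    return f"{PREFIX}\nSELECT {variables}\nWHERE {{\n  {body}\n}}"


query = build_query("Obligation", "CustomerDueDiligence")
print(query)
```

Validating the parameters against a simple name pattern keeps user input from altering the shape of the query, which matters once such templates are exposed through an interface.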
Figure 4. Querying the Semantically Enriched Text
Figure 5. Regulatory Intelligence Output in Excel
Figure 5 illustrates the result of the query
in an Excel spreadsheet for further analysis.
Here all the AML categories which carry an
Obligation are presented, as is their Section
and Sub-section. The text describing the
obligation is displayed next. Feedback
from the financial industry and technology
sector on the first-phase prototype was extremely positive. With minimal retraining, the application was applied to the
US Bank Secrecy Act, 31 CFR Subtitle B, Chapter X,
which covers AML. We were more than
pleased with the query results, which contained a remarkable number of positive
and accurate results. This bodes well for
the future uptake of our research by industry.
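Handing query results to a spreadsheet can be sketched as a simple tabular export. The example below writes rows to CSV, a format Excel opens directly, rather than to a native Excel file; this is a substitution made to stay within the standard library. The column names and the placeholder row values are hypothetical and are not actual output from the prototype.

```python
import csv
import io

FIELDS = ["aml_category", "section", "sub_section", "text"]

# Placeholder rows for illustration only; not real query results.
rows = [
    {"aml_category": "Customer Due Diligence", "section": "15", "sub_section": "1",
     "text": "(placeholder, obligation text)"},
    {"aml_category": "Record-keeping", "section": "15", "sub_section": "2",
     "text": "(placeholder, obligation text)"},
]


def to_csv(result_rows: list[dict[str, str]]) -> str:
    """Serialise result rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(result_rows)
    return buffer.getvalue()


output = to_csv(rows)
print(output)
```

The `csv` module quotes any field containing a comma, so free-text obligations survive the round trip into a spreadsheet column intact.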
Future Work
This short working paper described the
implementation of an approach to semantic
tagging of regulatory documents for the
purpose of regulatory change management
in the financial industry. Early results are
promising. We are currently designing and
developing a set of user interfaces for interactive data curation by SMEs. The aim of
ongoing work is to extend the role of SMEs
beyond the preparatory phase and keep them
in the loop at every stage of the prototype
execution. We have found that semantic
technologies can be calibrated, through
the application of domain-specific taxonomies and ontologies, to query (as opposed
to simple word search) unstructured texts
for specific categories of risk data. We argue that this is a significant development
for regulatory intelligence, as vital risk data
is often buried as unstructured facts in
text entries or memo fields in databases,
Excel spreadsheets and so on. This creates
significant problems for risk analysis and
compliance reporting. Thus, the output of
our research and development enables better regulatory intelligence throughout the
regulatory compliance value chain, as related working papers in this series outline.
References
Badreddin, O., Mussbacher, G., Amyot, D.,
Behnam, S.A., Rashidi-Tabrizi, R., Braun, E.,
Alhaj, M. and G. (2013). Regulation-Based
Dimensional Modeling for Regulatory Intelligence. In Requirements Engineering and
Law (RELAW), 2013 Sixth International
Workshop on, pp. 1-10. IEEE.
Bennett, M. (2011). Semantics standardization
for financial industry integration. In Collaboration Technologies and Systems (CTS),
IEEE, 23-27 May 2011, 439-445.
Bennett, M. (2013). The financial industry business ontology: Best practice for big data.
Journal of Banking Regulation, 14(3-4), 3-4.
Declerck, T., H.-U. Krieger, B. Kiefer, M. Spies and
C. Leibold (2007). Integration of semantic
resources and tools for business intelligence.
International Workshop on Semantic-Based
Software Development held at OOPSLA 2007.
Gomez-Perez, A., Martinez-Romero, M., Rodriguez-Gonzalez, A., Vazquez, G. and Vazquez-Naya, J. M. (2013). Ontologies in medicinal
chemistry: current status and future challenges. Current topics in medicinal chemistry, 13(5), 576-590.
Grant, W. and Wilson, G. K. (Eds.). (2012) The
Consequences of the Global Financial Crisis:
The Rhetoric of Reform and Regulation. OUP
Oxford.
Hamburg, M. A. (2011). Advancing regulatory
science. Science, 331(6020), 987-987.
Hevner, A. R. (2007). The three cycle view of
design science research. Scandinavian journal of information systems, 19(2), 87-92.
Hevner, A.R. March, S.T. and Park, J. (2004). Design Science in Information Systems Research, MIS Quarterly, 28(1), 75 – 105.
Hindmoor, A. and McConnell, A. (2013) Why
Didn't They See it Coming? Warning Signs,
Acceptable Risks and the Global Financial
Crisis. Political Studies. DOI: 10.1111/j.1467-9248.2012.00986.x.
Kendall, E. (2013). Semantics in Finance: Addressing Looming Train Wreck in Risk Management, Regulatory Compliance and Reporting. Semantic Technology and Business
Conference, Oct 2-3, 2013. http://semtechbiznyc2013.semanticweb.com/sessionPop.cfm?confid=76&proposalid=5402
KPMG (2012). The Convergence Evolution:
Global survey into the integration of governance, risk and compliance. http://www.kpmg.com/ES/es/ActualidadyNovedades/ArticulosyPublicaciones/Documents/TheConvergence-Evolution.pdf
McLaughlin, P. and Greene, R. (2013). Dodd-Frank: What It Does and Why It’s Flawed, Mercatus Center, George Mason University.
Sartor, G., P. Casanovas and M. Biasiotti (2011).
Approaches to legal ontologies: theories,
domains, methodologies, Springer.
Sesen, M. B., Suresh, P., Banares-Alcantara, R.
and Venkatasubramanian, V. (2010). An
ontological framework for automated regulatory compliance in pharmaceutical manufacturing. Computers & Chemical Engineering, 34(7), 1155-1169.
Sheth, A. (2005) Enterprise Applications of Semantic Web: The Sweet Spot of Risk and
Compliance. Invited paper: IFIP International Conference on Industrial Applications
of Semantic Web (IASW2005), Jyvaskyla,
Finland, August 25-27, 2005. http://www.cs.jyu.fi/ai/OntoGroup/IASW-2005/
Tudorache, T., Nyulas, C., Noy, N. F., and Musen,
M. A. (2013). WebProtege: a collaborative
ontology editor and knowledge acquisition
tool for the Web. Semantic Web, 4(1), 89-99.
Wand, Y. and Weber, R. (1993). On the ontological expressiveness of information systems
analysis and design grammars. Information
Systems Journal, 3(4), 217-237.
Wand, Y. and Weber, R. (1995). On the deep
structure of information systems. Information Systems Journal, 5(3), 203-223.
Wand, Y. and Weber, R. (2002). Research commentary: information systems and conceptual modeling—a research agenda. Information Systems Research, 13(4), 363-376.
West, D. (2009). Object thinking. O'Reilly Media,
Inc.
Winter, R. (2008). Design science research in
Europe. European Journal of Information
Systems, 17(5), 470-475.
Zeni, N., Kiyavitskaya, N., Mich, L., Cordy, J. R.
and Mylopoulos, J. (2013). GaiusT: supporting the extraction of rights and obligations
for regulatory compliance. Requirements
Engineering, 1-22. DOI 10.1007/s00766-013-0181-8.
About the Authors
Tom Butler, PhD, is Principal Investigator of the Financial Services Governance Risk and Compliance Technology Centre (GRCTC). With funding of 5 million euro from the Irish Government,
the GRCTC conducts research on the design, development and implementation of semantic
technologies for governance, risk and compliance (GRC) in the financial industry globally. Tom
has 111 publications since joining academia in 1998. He is currently ranked 33rd out of the top
100 Association for Information Systems (AIS) Senior Scholars and researchers globally.
Elie Abi-Lahoud, PhD, has designed innovative technologies for enterprise solutions. He plays
a key role at the GRCTC which is engaged in applying semantic technologies for GRC in financial services. In this role, Elie works with the Object Management Group (OMG), the Enterprise
Data Management Council (EDMC), and thought leaders in the financial industry, on a common vocabulary capturing shared domain understanding and on improving regulation-aware
decision-making.
Angelina Espinoza Limón, PhD, is a Visiting Professor at the GRCTC. A former Software Engineer, Angelina conducts research on the design, development and implementation of semantic technologies for regulatory compliance in the financial industry. Here she builds on her
experience in developing RDF/RDFS and OWL ontologies and business rules for supporting
semantic interoperability for the Smart Grid. Angelina has over 27 publications.