6 & 7 May 2010
Committee for the Coordination of Statistical Activities
Conference on Data Quality for International Organisations
Data Quality Management for Securities
– the case of the Centralised Securities DataBase
Francis Gross
DG Statistics, External Statistics Division
Financial statistics based on micro data
• Statistics based on individual securities data
– Micro-data generated from business processes
– Classifications by statisticians
• Benefits
– Flexibility in serving event-driven policy needs in near-real time
– Ability to drill down, linking macro- to micro-issues
• Tool: the Centralised Securities Database (CSDB)
– Holds data on nearly 7 million securities
– Is in production with 27 National Central Banks online
[Diagram] Securities data + issuer data + holdings data* → macro-statistics on
“who finances whom” and “how”
* aggregated by economic sector and country of residence
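The combination can be illustrated with a toy aggregation. A minimal sketch, assuming hypothetical s-b-s holdings records and a security-to-issuer mapping; the real CSDB schema is far richer:

```python
from collections import defaultdict

# Hypothetical s-b-s records: (holder sector/country, security, amount),
# plus a mapping from each security to its issuer's sector/country.
holdings = [
    ("NL/S.11", "SEC1", 100.0),
    ("DE/S.14", "SEC1", 50.0),
    ("NL/S.11", "SEC2", 70.0),
]
issuer_of = {"SEC1": "FR/S.12", "SEC2": "NL/S.13"}

# "Who finances whom": aggregate holdings by (holder group, issuer group).
who_finances_whom: dict[tuple[str, str], float] = defaultdict(float)
for holder, security, amount in holdings:
    who_finances_whom[(holder, issuer_of[security])] += amount

for (holder, issuer), amount in sorted(who_finances_whom.items()):
    print(f"{holder} finances {issuer}: {amount}")
# DE/S.14 finances FR/S.12: 50.0
# NL/S.11 finances FR/S.12: 100.0
# NL/S.11 finances NL/S.13: 70.0
```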
Centralised Securities Database (CSDB)
• CSDB provides consistent and up-to-date reference,
income, price and volume data on individual securities
• CSDB is shared by the European System of Central Banks
(ESCB)
• CSDB is intended to be the backbone for producing
consistent and harmonised securities statistics
• CSDB plays a pivotal role in s-b-s* reporting as the
reference database for the ESCB & associated institutions
* s-b-s: security by security
Main features of the CSDB
• Multi-source system
– For coverage and quality: data from several providers
(5 commercial data providers, National Central Banks (NCBs))
• Daily update frequency
– 2 million price records and 1 million reference data records per day
• Automated construction of a “golden copy” (see the sketch after this list)
– Data on an instrument are grouped using algorithms;
the most plausible attribute values are selected
• Data Quality Management (DQM) Network
– Staff from all NCBs contribute to DQM
– Through human intervention and, increasingly, automated systems, to
check “raw” data and validate “golden copy” results
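The deck does not spell out the compounding algorithm. As a minimal illustration of “the most plausible attribute values are selected”, here is one plausible rule: attribute-level majority voting across providers, with a per-provider reliability rank as tie-break. All provider names and the ranking are hypothetical.

```python
from collections import Counter

def golden_value(candidates: dict[str, str], reliability: dict[str, int]) -> str:
    """Pick the most plausible value for one attribute of one security,
    given candidate values from several data providers (illustrative rule:
    majority vote, ties broken by a per-provider reliability rank)."""
    votes = Counter(candidates.values())
    top = max(votes.values())
    tied = [value for value, n in votes.items() if n == top]
    if len(tied) == 1:
        return tied[0]
    # Tie: prefer the value reported by the most reliable provider.
    best = max((p for p in candidates if candidates[p] in tied),
               key=lambda p: reliability.get(p, 0))
    return candidates[best]

# Example: three commercial providers and one NCB report a coupon rate.
print(golden_value(
    {"providerA": "4.25", "providerB": "4.25", "providerC": "4.52", "NCB": "4.25"},
    {"NCB": 3, "providerA": 2, "providerB": 1, "providerC": 1},
))  # -> "4.25"
```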
Data quality – two pillars
• Data quality managers face two critical problems:
– What data is correct? Where to look for the “truth”?
(verify vs prospectus, internal databases, Google…)
– How can dubious data be identified?
• The data quality of the CSDB depends heavily on:
Data Quality Management:
• Tackle issue at NCB
• DOWNSTREAM
• Individual securities
• Decentralised process

Data Source Management:
• Tackle issue at source
• UPSTREAM
• Preferably in bulk
• Centralised process
Metrics to steer and prioritise DQM
Strategy: identification, quantification, prioritisation
• Problem: no access to benchmark data
• Macro metrics:
– distribution of different reference data attributes
(e.g. price, income)
• Micro metrics (see the sketch below):
– inter-temporal comparison (stability index,
concentration change)
– consistency checks
– can drill down to the level of an individual security
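With no external benchmark available, dubious values must be flagged from the data’s own distribution. A minimal sketch of such a check, using a median/MAD outlier rule; the rule and threshold are illustrative, not the CSDB’s actual checks:

```python
import statistics

def flag_dubious(prices: dict[str, float], k: float = 10.0) -> list[str]:
    """Flag securities whose price deviates strongly from the median of the
    peer group, scaled by the median absolute deviation (illustrative rule)."""
    med = statistics.median(prices.values())
    mad = statistics.median(abs(p - med) for p in prices.values())
    return [s for s, p in prices.items() if mad and abs(p - med) > k * mad]

# A price of 9985 where peers trade near 100 suggests a decimal-point error.
print(flag_dubious({"SEC1": 99.8, "SEC2": 101.2, "SEC3": 100.5, "SEC4": 9985.0}))
# ['SEC4']
```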
Example: metric for change in country / sector
• The system calculates indices of stability in country / sector for
the relevant group of securities between t0 and t1
• An index below 1 for a country / sector pair shows change in
the group of securities (country or sector or both)
• Two indices are calculated: from the perspective of t0 (sees leavers)
and from t1 (sees joiners) in a country / sector group
Example for the sector/country group NL/S.11:
• at t0 the group contains Security 1, Security 2 and Security 3;
at t1 it contains Security 2, Security 3 and Security 4
• Security 1 leaves, Securities 2 and 3 stay, Security 4 joins
• t0 perspective (Laspeyres concept): covers leavers but no joiners
• t1 perspective (Paasche concept): covers joiners but no leavers
→ Joining both concepts: Fisher index = √(index Laspeyres × index Paasche)
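A minimal sketch of these indices under one plausible reading: the index for a group is the share of its securities that stay in the group, so leavers depress the t0 (Laspeyres) view and joiners depress the t1 (Paasche) view. The function name and the equal weighting of securities are assumptions, not the CSDB’s actual definition.

```python
from math import sqrt

def stability_indices(group_t0: set[str], group_t1: set[str]) -> tuple[float, float, float]:
    """Laspeyres-, Paasche- and Fisher-style stability indices for one
    country/sector group of securities between t0 and t1."""
    stayers = group_t0 & group_t1
    laspeyres = len(stayers) / len(group_t0)  # t0 view: sees leavers
    paasche = len(stayers) / len(group_t1)    # t1 view: sees joiners
    fisher = sqrt(laspeyres * paasche)        # geometric mean of both views
    return laspeyres, paasche, fisher

# The NL/S.11 example above: Security 1 leaves, Security 4 joins.
print(stability_indices({"Sec1", "Sec2", "Sec3"}, {"Sec2", "Sec3", "Sec4"}))
# (0.666..., 0.666..., 0.666...): all below 1, signalling change in the group
```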
[Screenshot: 19 instruments, of which 14 kept their issuer identifier (CH_IDENT = 1)]
CSDB Data Quality Management
Made us face a choice:
either
SISYPHUS
or
CHANGE,
change beyond Statistics
Finding data quality further upstream
The upstream chain of securities reference data, each step costly:
• Prospectus: a “perfect”, public data source, but no common language
• Commercial data sets: error-prone, selective, costly production,
duplicate efforts, proprietary formats
• Candidates for compounding: a costly collection
• Compounding: a costly process, yielding a compound “first shot”
• Data Quality Management & defaulting
• Golden Copy: after DQM & defaulting, still not back to perfect,
and not standard

Duplication and non-standardisation in the very data generation process
hamper the whole downstream value chain.
Where to start?
The first layer of data out of reality:
its generation process matters.
Data capture drives IT output quality
• Once good data is in the system, processing can work well.
• Data capture from the “real” world is the key step.
• Information lost at capture is lost for good.
• No “data cleaning” will help: data must be captured again.
• Messy data capture at source is very expensive downstream:
– Most applications perform badly
– “Data cleaning” and fixing failed processes are costly for all
– Processes and IT must be designed in complicated ways
Large scale IT processing can be simple and cheap when
data fulfils the programmers’ quality assumptions.
Messy data capture delivers “garbage in, garbage out”.
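One concrete way to enforce quality at capture is to reject malformed identifiers before they enter the system. The sketch below applies the standard ISIN check-digit test (letters expanded to digits, then the Luhn checksum); a mistyped ISIN fails immediately instead of poisoning downstream processing.

```python
def valid_isin(isin: str) -> bool:
    """Validate an ISIN at data capture: 12 alphanumeric characters whose
    letter-expanded digit string passes the Luhn checksum."""
    if len(isin) != 12 or not isin[:2].isalpha() or not isin.isalnum():
        return False
    digits = "".join(str(int(c, 36)) for c in isin)  # 'A' -> 10 ... 'Z' -> 35
    total = 0
    for i, d in enumerate(reversed(digits)):
        n = int(d)
        if i % 2 == 1:        # double every second digit from the right
            n = 2 * n
            if n > 9:
                n -= 9
        total += n
    return total % 10 == 0

print(valid_isin("US0378331005"))  # well-formed ISIN  -> True
print(valid_isin("US0378331006"))  # wrong check digit -> False
```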
Progress is on its way
“…a standard for reference data on securities and issuers,
with the aim of making such data available to policy-makers,
regulators and the financial industry through an international
public infrastructure.” (J.C. Trichet, 23.2.09)
The industry expresses demand for a Utility
• Industry panel at a conference in London, 15 Feb 2010:
– “An international Utility for reference data has its place, but
– keep it simple (concept of a ‘Thin Utility’),
– ask industry to design the standards (ISO does exactly that), and
– give us the legal stick.”
A viable reference data infrastructure
benefits from constructive dialogue.
From browsing to farming for data:
the long way to standardisation
Climbing the stairway to action (top step listed first)
Build into data ecosystem
Design a legal framework
Imagine solutions addressing legacy
Accept the issue among priorities
Build the business case with all stakeholders
Imagine a feasible way; accept that way as useful
Understand dynamics of standardisation
Understand basic data as a shared strategic resource
Understand how basic data is generated
Understand the role of data as a necessary infrastructure
Business leaders, Policy makers, Regulators & Legislators
now embrace the dialogue with the Data Community
“Thin” Utility
“Thin Utility”: a unique, shared reference frame
• Two registers: one for instruments, one for entities
• Simple and light, complete and unequivocal
• Hard focus on identification and minimal description
• The shared infrastructure of basic reference data for:
– Data users in the financial industry
– Data vendors
– Authorities
– The Public
• An internationally shared infrastructure of reference data
A “Thin Utility” provides the certainty
of a single source on known, bare basics.
Two reference registers: the Thin Utility’s frame
Register of entities | Register of instruments
Each register holds:
• unique identifier,
• key attributes,
• interrelations,
• classifications,
• electronic contact address
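A minimal sketch of what a record in each register might hold, using the attribute classes listed above; every field name is illustrative, not a proposed standard.

```python
from dataclasses import dataclass

@dataclass
class EntityRecord:
    """One row in the register of entities."""
    uid: str                         # unique identifier
    key_attributes: dict[str, str]   # e.g. legal name, country of residence
    interrelations: list[str]        # uids of related entities (e.g. parent)
    classifications: dict[str, str]  # e.g. economic sector
    contact_address: str             # electronic contact address

@dataclass
class InstrumentRecord:
    """One row in the register of instruments."""
    uid: str                         # unique identifier (e.g. an ISIN)
    issuer_uid: str                  # interrelation: link into the entity register
    key_attributes: dict[str, str]   # e.g. instrument type, currency
    classifications: dict[str, str]  # e.g. debt vs. equity
```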
The Utility grows from a quickly feasible base
Register of entities | Register of instruments
For both registers:
• begin with a feasible scope,
• grow over time by
– adding instruments & entities,
– adding attribute classes,
• driven by demand from industry and authorities,
• and by feasibility
The international aspect
Global Utility vs. National Law: an option
International Community (e.g. G20):
• discusses new regulatory framework for financial markets
• defines principles / goals for data

National Legislator: issues law that
• mandates a national authority,
• empowers it to enforce the process, and
• allows it to farm out operations to an international entity
• (the EU issues specific EU law)

National Authority: executes the legal mandate
• farms out operations to the Int’l Operational Entity
• monitors compliance
• enforces, applies sanctions

Entity: complies with national law
• delivers and maintains data in the Utility as required,
possibly using services

International Institutions: governance of the Operational Entity
• global „tour de table“ (IMF, BIS, industry, etc.)
• establishment of the Int’l Operational Entity
• seed funding of the Int’l Operational Entity?

International Operational Entity: runs the service (the Utility),
under service agreements with national authorities
• collects data
• distributes / sells data
• certifies analysts
• monitors compliance
• informs national authorities
• releases new standard items

Standards College (ideally ISO-based): develops/maintains standards
• designs initial standards
• monitors market developments
• steers evolution of standards
• designs new standard items
Positioning and Design
Positioning in the data supply chain
• The Utility sits between the issuer and the competitive downstream market
of data users: policy makers, regulators and the public
• Option: Utility as Tier 1 CDP (commercial data provider)
• Initially, the downstream supply chain remains untouched, except for quality:
data users don’t need to invest
Utility value chain: monopoly vs. competition
Value chain stage              Organisational model
Standards: design              Multilateral (ISO?)
Standards: setting             Monopoly
Analyst training               Competition
Analyst certification          Monopoly
Data production                Competition
Data distribution: primary     Monopoly
Data distribution: secondary   Competition
Each stage of the value chain should
be given the most efficient organisation