Scanner Data Project - The experience of Portugal

Paulo Saraiva dos Santos
Data Collection Department
Scanner Data Workshop
7-8 June 2012
Stockholm, Sweden
Outline
• Background and context;
• Portuguese Scanner Data project;
• Collaboration with data providers;
• Lessons learned;
• The way forward.
Statistics Portugal
• Central national authority for the production
of official statistics;
• It is legally empowered to require
information (on a mandatory and free-of-charge basis);
• Survey data collection is a core activity:
– 40% of the budget and 30% of human resources.
Data Collection
• A Data Collection department ensures the collection,
processing and analysis of collected microdata, covering
all business and social surveys;
• HR: 200 workers + 350 freelance interviewers
Annual figures
• 81 surveys and 700 occurrences:
– 66 business (self-completed);
– 15 by interview (CAPI and CATI).
• 125,000 companies (*) (99% SME);
• 70,000 dwellings (*);
• 30,000 farms.
Portuguese CPI
• Prices are mainly collected using paper
questionnaires, and the price collector who
observes a price also enters the data at home;
• Afterwards, the prices collected by each
price collector are sent daily to local offices,
where they are processed and analyzed;
• Then, regional indexes are calculated and sent
to the Lisbon office to be aggregated and
consolidated.
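The last step above (regional indexes aggregated in Lisbon) can be illustrated with a minimal sketch. The slides do not state the aggregation rule, so this assumes a weighted arithmetic mean; the region names, index values and weights are hypothetical.

```python
def aggregate_index(regional_indexes, weights):
    """Weighted arithmetic mean of regional indexes (assumed aggregation rule)."""
    if regional_indexes.keys() != weights.keys():
        raise ValueError("every region needs exactly one weight")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(regional_indexes[r] * weights[r] for r in regional_indexes)
    return weighted_sum / total_weight


# Hypothetical figures, for illustration only.
regional_indexes = {"Norte": 102.0, "Centro": 101.0, "Lisboa": 104.0}
weights = {"Norte": 0.5, "Centro": 0.25, "Lisboa": 0.25}

national_index = aggregate_index(regional_indexes, weights)
print(national_index)
```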
Portuguese CPI
• Annual data collection figures:
– 1,355,000 prices;
– 11,000 outlets;
– 150 price collectors (freelance interviewers).
• Collection cost: 740 k€,
– 83% of which is the cost of the field team.
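A quick worked calculation puts the annual figures above in perspective. It uses only the numbers quoted on the slide; the derived ratios (cost per price, prices per collector) are simple averages and are not figures reported by Statistics Portugal.

```python
# Annual figures quoted on the slide.
collection_cost_eur = 740_000
field_team_share_pct = 83
prices_per_year = 1_355_000
outlets = 11_000
price_collectors = 150

# Derived values (plain averages, for orientation only).
field_team_cost_eur = collection_cost_eur * field_team_share_pct // 100
cost_per_price_eur = collection_cost_eur / prices_per_year
prices_per_outlet = prices_per_year / outlets
prices_per_collector = prices_per_year / price_collectors

print(f"Field team cost: {field_team_cost_eur} EUR")
print(f"Average cost per collected price: {cost_per_price_eur:.2f} EUR")
print(f"Average prices per outlet per year: {prices_per_outlet:.0f}")
print(f"Average prices per collector per year: {prices_per_collector:.0f}")
```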
Production architecture
[Diagram labels: National Accounts; Economics; Social and Demographics; Methods and Information Systems; CPI; Data Collection]
Scanner Data Project
• Scanner data in the Portuguese CPI has a
high priority for the next years.
– As a consequence of Statistics Portugal's strategy to
collect data efficiently, assuring a high level of quality
and with the lowest response burden, the effort of
implementing …
• Statistics Portugal was awarded a Eurostat grant
in 2011 to undertake the initial
research on the exploitation of scanner data
over the period 2011-13.
Lines of action
1. Knowledge acquisition;
2. Collaboration with data providers;
3. Pilot project;
4. Infrastructure.
• This paper presents the current status of the
scanner data project, especially our
experience to date in obtaining collaboration
from data providers.
Knowledge acquisition
• To access data on product characteristics for
products covered by scanner data:
– the use of the European Article Number (EAN) and its
linkage with the COICOP classification and in-store
codes;
– to learn from experiences of other countries using
scanner data.
• Two visits:
– CBS (The Netherlands): September 2011
– FSO (Switzerland): October 2012
Pilot project
• to establish continuous scanner data flow
routines with selected retail chains;
• to develop the linking of the aggregated and the
product-level codes found in scanner data to
the COICOP statistical classification;
• to develop internal data methods for storing and
processing scanner data based on
data warehouse and data mining approaches;
• to develop sample designs and weights,
including methods to integrate scanner data with
the existing price collection processes.
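The linking of product-level codes to COICOP mentioned above could, for instance, go through the retailer's in-store category codes. The sketch below assumes that two-step route; the EANs, category names and lookup tables are hypothetical, and only the COICOP class codes are taken from the classification itself.

```python
# Hypothetical lookup tables. Real mappings would be maintained jointly
# with the retailer and the prices unit.
STORE_CATEGORY_TO_COICOP = {
    "BAKERY": "01.1.1",       # Bread and cereals
    "DAIRY": "01.1.4",        # Milk, cheese and eggs
    "SOFT_DRINKS": "01.2.2",  # Mineral waters, soft drinks, juices
}

EAN_TO_STORE_CATEGORY = {
    "5600000000011": "DAIRY",        # made-up EAN
    "5600000000028": "BAKERY",       # made-up EAN
    "5600000000035": "SOFT_DRINKS",  # made-up EAN
}


def coicop_for_ean(ean):
    """Return the COICOP class for an EAN, or None when it cannot be linked."""
    category = EAN_TO_STORE_CATEGORY.get(ean)
    if category is None:
        return None
    return STORE_CATEGORY_TO_COICOP.get(category)


def unlinked_eans(eans):
    """List the EANs that still need a manual COICOP assignment."""
    return [ean for ean in eans if coicop_for_ean(ean) is None]


print(coicop_for_ean("5600000000011"))
print(unlinked_eans(["5600000000011", "5600000000999"]))
```

Returning `None` for unknown codes keeps unlinked products visible, so they can be reviewed rather than silently dropped.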
Infrastructure
• to design and build an information system to
support the pilot project.
Collaboration with providers
• to explore and negotiate arrangements to
access scanner data from retail chains, and
• to select data providers for a pilot;
• as a simplification, the scope was limited to
purchases of food and non-alcoholic
beverages, which represent approximately
15% of our CPI fixed basket of goods and
services.
Target data providers
• Five chains dominate the sector;
• The estimated structure of the sector shows the
following market shares (food and non-alcoholic beverages):
– Modelo Continente (31%); Pingo Doce (30%);
Auchan (14%); Lidl (10%); Minipreço (8%); and
others (7%).
• Pioneers:
– the top two retail chains;
– together they hold about 60% of the national market;
– they account for 40% of the prices and outlets collected
for the HICP in the food and beverages class.
SONAE Group
• First meeting in November 2011;
• The results were extremely positive, and it was
possible to create the conditions for a
formal protocol with this retailer, which was
signed at the beginning of 2012.
• The first two sets of scanner data were received
in April and May 2012, covering
two consecutive months of 2012:
– including the list of outlets, the identification and characteristics of the
items, and accumulated transactions (quantities and turnover) in the
reference period.
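Since the files carry accumulated quantities and turnover rather than shelf prices, one common way to derive a price is the unit value (turnover divided by quantity). The slides do not say which method Statistics Portugal applies, so the following is only a sketch under that assumption, with made-up figures.

```python
def unit_value(turnover_eur, quantity):
    """Average price paid per unit over the reference period."""
    if quantity <= 0:
        raise ValueError("quantity must be positive to compute a unit value")
    return turnover_eur / quantity


# Made-up monthly totals for one item in one outlet.
april_price = unit_value(turnover_eur=150.0, quantity=200)
may_price = unit_value(turnover_eur=156.0, quantity=200)

# Month-on-month price relative between the two consecutive months.
price_relative = may_price / april_price

print(april_price, may_price, round(price_relative, 4))
```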
Jerónimo Martins Group
• First meeting in December 2011;
• Cooperation was agreed in March
2012, and two bilateral meetings have been
set up.
• First data set expected in June 2012.
Approaching retailers (1)
• Statistics Portugal is legally empowered
to require statistical information
from all companies (on a mandatory and
free-of-charge basis);
• This is a strong advantage, but it is
considered insufficient to motivate data
providers to take part free of charge in
the scanner data project.
Approaching retailers (2)
• It is difficult to demonstrate that scanner
data will reduce the statistical burden;
• Retailers have a completely different perception:
– having our price collectors visit their outlets is not
perceived as a relevant burden;
– data providers do the same with their
competitors in order to compare prices;
• Key challenge: to find strong arguments
to convince retailers to join the project.
Approaching retailers (3)
• Another issue: what kind of information
would be required from the providers.
• The information is very sensitive;
• While stressing that we are flexible, we chose
to show an example of the data structure:
– List of the outlets;
– Item descriptions;
– Transactions.
• This approach was appreciated by the providers.
Data Files structure (example)
[Figure: example data file structure, shown over two slides]
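The three files named earlier (list of outlets, item descriptions, transactions) could be modelled roughly as follows. The actual layouts shown on the slides are not recoverable, so every field name here is a hypothetical placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outlet:
    outlet_id: str      # hypothetical retailer-assigned identifier
    name: str
    municipality: str


@dataclass(frozen=True)
class Item:
    ean: str            # European Article Number
    description: str
    store_category: str  # retailer's in-store code


@dataclass(frozen=True)
class Transaction:
    outlet_id: str
    ean: str
    period: str         # reference period, e.g. "2012-04"
    quantity: float     # accumulated units sold
    turnover_eur: float  # accumulated sales value


def orphan_transactions(outlets, items, transactions):
    """Transactions that refer to an outlet or item missing from the other files."""
    known_outlets = {o.outlet_id for o in outlets}
    known_items = {i.ean for i in items}
    return [
        t for t in transactions
        if t.outlet_id not in known_outlets or t.ean not in known_items
    ]


outlets = [Outlet("OUT-001", "Example store", "Lisboa")]
items = [Item("5600000000011", "Milk 1 L", "DAIRY")]
transactions = [
    Transaction("OUT-001", "5600000000011", "2012-04", 200, 150.0),
    Transaction("OUT-999", "5600000000011", "2012-04", 10, 7.5),
]

print(orphan_transactions(outlets, items, transactions))
```

A consistency check of this kind is one simple way to validate each delivery before it is loaded.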
First meeting approach
We found that the most convincing message in the
first face-to-face meeting was the following:
1. Scanner data is the future and will be adopted in
our country;
2. The design of this new process has started
and will define the way retail chains will be
required to provide CPI data;
3. We offer the provider the opportunity to
participate from the very beginning of the project,
influencing its design in order to be
prepared in advance.
Lessons learned
• Full support from the top management;
• Working Group with a single leader;
• It is difficult and takes time to reach internal
consensus on some CPI methodological
changes resulting from scanner data;
• Key issues are not technical:
– Organizational & change management;
– Collaboration with data providers;
• Step-by-step approach;
• It is crucial to get a top retail chain to join first.
The way forward (1)
• Create an internal Working Group on Scanner
Data, integrating members from the prices unit, data
collection, information systems and
methodology;
• Establish a working plan for 2013 to 2017;
• First access to data from the two data providers
in a retrospective approach;
• Evaluate ways to link products to the COICOP
classification, especially using EAN and internal
store codes;
The way forward (2)
• Link items collected traditionally and the
scanner data sent by the two providers;
• Analyze and understand price variations
between scanner data and the traditional
approach;
• Specify and develop the application to support
the pilot project, which aims to establish
continuous scanner data flow routines with
selected retail chains;
• Evaluate some changes in the CPI methods.
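The first two steps above (linking traditionally collected items to scanner data, then analyzing price differences) might be prototyped as a simple keyed comparison. This sketch assumes both sources can be keyed by outlet and EAN, which the slides do not confirm; all identifiers and prices are made up.

```python
def relative_difference(scanner_price, collected_price):
    """Scanner price relative to the traditionally collected price."""
    return (scanner_price - collected_price) / collected_price


def compare_sources(collected, scanner):
    """Relative price difference for every (outlet, EAN) present in both sources."""
    return {
        key: relative_difference(scanner[key], price)
        for key, price in collected.items()
        if key in scanner
    }


# Made-up prices keyed by (outlet_id, ean).
collected = {
    ("OUT-001", "5600000000011"): 0.80,  # shelf price seen by the collector
    ("OUT-001", "5600000000028"): 1.20,
}
scanner = {
    ("OUT-001", "5600000000011"): 0.75,  # unit value from scanner data
    ("OUT-002", "5600000000035"): 0.60,
}

differences = compare_sources(collected, scanner)
for key, diff in differences.items():
    print(key, f"{diff:+.2%}")
```

Items found in only one source are left out of the comparison here; in practice they would be worth listing separately, since they show where the two collections do not overlap.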
Thank you for your attention!
Paulo Saraiva dos Santos
paulo.saraiva@ine.pt