Nielsen Out of Home Reach and Frequency Methodology

Contents
Introduction
Section 1: Calculation Methodology for a Single Campaign Flight
1.1 Overview
1.2 Notation
1.3 The Process
1.3.1 Preliminary Analyses to Generate Inputs
1.3.2 Algorithm to Estimate Model Parameters
1.3.3 Calculating Outputs
1.3.4 Demographic Consistency
1.3.5 Negative Reach Build
1.3.6 Schedule Reliability
1.3.7 Sites with Zero Passage
1.4 Example
Section 2: Schedule Reach Calculation for 2+ Flights
2.1 Overview
2.2 Methodology and Example
2.3 Generating the frequency distribution
2.4 Three or more Schedules
Introduction
This document describes the reach and frequency calculation methodology for
SAARF/Nielsen Outdoor data. These data are collected by GPS tracking (via the
Npod) of respondents’ travel patterns over a nine-day data collection period. The
travel data are cross-referenced with outdoor advertising locations, and a probability
model is then applied to generate schedule reach and frequency. This model allows
the full frequency distribution to be calculated.
The first section describes the basic model as it applies to a single campaign flight, and
the second section describes how it is applied to multiple campaign flights of different
lengths.
Section 1: Calculation Methodology for a Single Campaign Flight
1.1 Overview
The methodology for calculation of reach and frequency employs the Gamma
Poisson Distribution (also known as the Negative Binomial Distribution, or NBD).
The methodology ensures consistency with published nine-day GRPs and
reach, and generates GRPs, reach and frequency distributions for any number of
days. In summary, the methodology works as follows:
1. Published (9-Day) GRPs are calculated for the schedule. These are
   calculated using the Board Impressions File.
2. A weighted reach and frequency analysis is generated using the intab [1]
   reporting sample for Day 1. Those persons not intab on subsequent days
   are deemed to have no exposures on those days. This analysis is used
   to generate the initial parameters for the model.
3. The initial model parameters are scaled to ensure that the model
   produces results that are consistent with the published nine-day GRPs.
4. The model with the scaled parameters can then be used to generate the
   schedule frequency distribution and total GRPs for the required number of
   days.
[1] An “intab” respondent is defined as a respondent whose data for a given day has passed through a quality
assurance process and has been included in the reporting sample.
The algorithm also covers the case where the frequency distribution
reduces to a Poisson distribution, and the extreme cases where Reach is 100%
or 0%.
We recommend that single-precision floating point arithmetic is used. If fixed
point calculation is used, the values should be retained to at least two decimal
places if respondent weights are in units and five decimal places in the case of
weights in thousands.
1.2 Notation
Gp     : Published 9-Day GRPs for the schedule. Throughout we assume
         this to be greater than 0.
Gr     : 9-Day GRPs calculated in the Set-up reach and frequency analysis
fr(0)  : Weighted proportion of persons with zero exposures over 9 days
         in the Set-up reach and frequency analysis
c      : A constant used in the calculation of NBD parameters
x!     : x factorial
ln(x)  : Natural logarithm of x
abs(x) : Absolute value of x
a, k   : NBD Parameters
a′, α  : Scaled NBD Parameters
t      : Unit of time used in model
d      : Number of Days. NB: t = d/9.
λ      : Poisson Parameter
fp(i)  : Final Gold Standard proportion of individuals with i exposures to the
         schedule, consistent with published 9-day GRPs
Gp(d)  : Final Gold Standard schedule GRPs for day d
1.3 The Process
1.3.1 Preliminary Analyses to Generate Inputs
Analysis 1: 9-Day GRPs
Adults 16+
The Board Impressions File is used to calculate Adults 16+ Total GRPs over nine
days, as follows:
a) Sum the daily passage estimates in the file for the selected sites and multiply
by nine. This gives Adults 16+ nine day impressions in 000s.
b) Divide the nine day impressions by the Adults 16+ population estimate and
multiply by 100 to obtain Adults 16+ GRPs. This figure, denoted Gp, acts as
an input into the Reach and Frequency Model.
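Steps (a) and (b) can be sketched in a few lines of Python. This is only an illustration: the function name, the list of daily passage estimates and the population figure are hypothetical, and both inputs are assumed to be in 000s.

```python
def nine_day_grps(daily_passages_000s, population_000s):
    """Adults 16+ nine-day GRPs from per-site daily passage estimates (000s)."""
    # Step (a): sum the daily passages for the selected sites, times nine days
    nine_day_impressions = 9 * sum(daily_passages_000s)
    # Step (b): divide by the population estimate and multiply by 100
    return 100.0 * nine_day_impressions / population_000s

# Hypothetical schedule of three sites and a population of 1,000 (000s)
gp = nine_day_grps([12.0, 8.5, 4.5], 1000.0)
print(gp)  # 22.5
```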
Demographic Analyses
To generate the nine-day GRPs for input into demographic analyses, the
following preliminary results should be generated:
A = Smoothed Adults 16+ nine-day GRPs as described above
B = Survey-calculated Adults 16+ nine-day GRPs
C = Survey-calculated Demographic nine-day GRPs
Then the input into the Reach and Frequency model, Gp, is calculated as follows:
Gp = (A/B) x C
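As a minimal sketch of this adjustment, the ratio A/B rescales the survey-calculated demographic GRPs to the smoothed Adults 16+ level. The function name and the input values below are hypothetical.

```python
def demographic_gp(smoothed_adult_grps, survey_adult_grps, survey_demo_grps):
    """Gp for a demographic: (A / B) x C."""
    if survey_adult_grps <= 0:
        raise ValueError("survey-calculated Adults 16+ GRPs must be positive")
    return smoothed_adult_grps / survey_adult_grps * survey_demo_grps

# Hypothetical values: A = 100, B = 80, C = 120
print(demographic_gp(100.0, 80.0, 120.0))  # 150.0
```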
Analysis 2: Generate a Reach and Frequency analysis for 9 days of data using the
weighted Intab sample for Day 1.
NB:
(i) Use Day 1 weights for each day.
(ii) If a person is not intab on any of days 2 to 9, the exposures for that person
on that day are assumed to be zero.
This analysis is used to generate the following Inputs to the model:
a) The set-up weighted 9-Day GRPs, denoted as Gr.
b) The weighted proportion of persons with 0 exposures, denoted as fr(0).
Note that the inputs Gr and fr(0) should not be rounded or truncated.
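The set-up analysis might be sketched as follows. The data layout is an assumption made for illustration only: each respondent is a pair of Day 1 weight and nine daily exposure counts, with None marking a day on which the person was not intab.

```python
def setup_inputs(respondents):
    """Return (Gr, fr0) from a list of (day1_weight, nine daily exposure counts)."""
    total_weight = sum(weight for weight, _ in respondents)
    weighted_exposures = 0.0
    zero_weight = 0.0
    for weight, daily in respondents:
        # A day on which the person is not intab (None) counts as zero exposures
        exposures = sum(e or 0 for e in daily)
        weighted_exposures += weight * exposures
        if exposures == 0:
            zero_weight += weight
    gr = 100.0 * weighted_exposures / total_weight   # set-up 9-day GRPs
    fr0 = zero_weight / total_weight                 # proportion with 0 exposures
    return gr, fr0

# Hypothetical Day 1 intab sample of three respondents
sample = [
    (1.0, [1, 0, 0, 2, 0, 0, 0, 0, 0]),
    (2.0, [0] * 9),
    (1.0, [0, None, None, 1, 0, 0, 0, 0, 0]),
]
print(setup_inputs(sample))  # (100.0, 0.5)
```

Neither output is rounded, in line with the note above.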
1.3.2 Algorithm to Estimate Model Parameters
The NBD is a two-parameter model. There are various ways of estimating the
parameters – one approach is outlined below. This approach also checks for
special cases where the standard model is inappropriate.
Step  Algorithm                              Comments
10    if fr(0)=0 then set fr(0)=0.001        Amend 100% Reach
      if fr(0)=1 then set fr(0)=0.999        Amend 0% Reach
20    c = Gr/(100*ln(fr(0)))                 Calculate ‘c’
30    if c ≥ -1 go to 60                     Check for Poisson Condition
40    a = -2*(1+c)                           Iterative procedure to estimate
41    b = a                                  NBD Parameter ‘a’
      a = c*(a-(1+a)*ln(1+a))/(1+a+c)
      if abs(a-b)<0.0001 then go to 50
      go to 41
50    k = Gr/(100*a)                         NBD Parameter ‘k’
      a′ = a*Gp/Gr                           Scaled NBD Parameters
      α = 1/a′
      go to End
60    λ = Gp/100                             Poisson Parameter
End
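The steps above can be sketched in Python as follows. The function name, the tuple return format and the max_iter guard are assumptions added for illustration; the step numbers in the comments refer to the algorithm above. The update in step 41 is a Newton iteration for the root of a + c*ln(1+a) = 0.

```python
import math

def estimate_parameters(gp, gr, fr0, tol=0.0001, max_iter=1000):
    """Return ("nbd", k, alpha) or ("poisson", lam) following steps 10-60."""
    # Step 10: amend 100% and 0% reach
    if fr0 == 0:
        fr0 = 0.001
    if fr0 == 1:
        fr0 = 0.999
    c = gr / (100.0 * math.log(fr0))          # step 20
    if c >= -1:                               # step 30: Poisson condition
        return ("poisson", gp / 100.0)        # step 60
    a = -2.0 * (1.0 + c)                      # step 40
    for _ in range(max_iter):                 # step 41 (max_iter is a safety guard)
        b = a
        a = c * (a - (1 + a) * math.log(1 + a)) / (1 + a + c)
        if abs(a - b) < tol:
            break
    k = gr / (100.0 * a)                      # step 50: NBD parameter k
    a_scaled = a * gp / gr                    # scale to the published GRPs
    alpha = 1.0 / a_scaled
    return ("nbd", k, alpha)

# Hypothetical inputs: Gp = 110, Gr = 100, fr(0) = 0.5
print(estimate_parameters(110.0, 100.0, 0.5))
```

With these inputs the iteration settles at a close to 1, so k is close to 1 and k/α equals Gp/100 = 1.1, which is the property the scaling step is designed to enforce.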
1.3.3 Calculating Outputs
Negative Binomial Distribution
The full frequency distribution for time t is obtained by first calculating the probability
of zero exposures. The probability of n exposures is then calculated from
the probability of n-1 exposures, as follows:
Probability of zero exposures:
fp(0) = (α/(α + t))^k
Probability of n exposures, for n ≥ 1:
fp(n) = ((k + n - 1)/n) * (t/(α + t)) * fp(n-1)
The average frequency AveF is given by:
AveF = kt/α
Note that the units of t are 9 days.
Schedule reach over d days, expressed as a percentage, is given by:
Reach = 100*(1 - (α/(α + d/9))^k)
GRPs are given by:
Gp(d) = (d/9)*Gp(9)
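A minimal Python sketch of these NBD outputs is given below. The function names are illustrative, and the parameter values k = 1 and α = 1 are hypothetical, chosen so that the arithmetic is easy to check by hand.

```python
def nbd_distribution(k, alpha, d, max_n):
    """fp(0..max_n) for a schedule of d days, using the recursion above."""
    t = d / 9.0                                    # time in units of 9 days
    probs = [(alpha / (alpha + t)) ** k]           # fp(0)
    for n in range(1, max_n + 1):
        probs.append(((k + n - 1) / n) * (t / (alpha + t)) * probs[n - 1])
    return probs

def nbd_reach(k, alpha, d):
    """Schedule reach over d days, as a percentage."""
    return 100.0 * (1.0 - (alpha / (alpha + d / 9.0)) ** k)

def grps_for_days(gp9, d):
    """GRPs delivered over d days, given the published 9-day GRPs."""
    return (d / 9.0) * gp9

print(nbd_distribution(1.0, 1.0, 9, 2))  # [0.5, 0.25, 0.125]
print(nbd_reach(1.0, 1.0, 9))            # 50.0
```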
Poisson Distribution
When the Poisson condition is indicated, the frequency distribution for time t is as
follows:
fp(i) = (λt)^i * e^(-λt) / i!
Schedule Reach over d days, expressed as a percentage, is therefore given as:
Reach = 100*(1 - e^(-λd/9))
The GRPs delivered are given by:
Gp(d) = (d/9)*Gp(9)
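The Poisson case can be sketched in the same way. The function names and the value λ = 0.5 are hypothetical and used only to illustrate the formulas.

```python
import math

def poisson_distribution(lam, d, max_i):
    """fp(0..max_i) for a schedule of d days under the Poisson condition."""
    t = d / 9.0
    return [(lam * t) ** i * math.exp(-lam * t) / math.factorial(i)
            for i in range(max_i + 1)]

def poisson_reach(lam, d):
    """Schedule reach over d days, as a percentage."""
    return 100.0 * (1.0 - math.exp(-lam * d / 9.0))

dist = poisson_distribution(0.5, 9, 50)
print(round(poisson_reach(0.5, 9), 2))  # about 39.35
```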
1.3.4 Demographic Consistency
Reach estimates from separate reach and frequency runs will not always be
additive. For example, a Men 16+ Reach estimate and a Women 16+ Reach
estimate will not necessarily be consistent with an Adults 16+ Reach estimate for
the same schedule. In cases where consistency is required, a “top-down”
balancing should be adopted. For example:
Adults 16+, Men 16+ and Women 16+ Schedule Analysis
             UE      Initial Reach (%)   Initial Reach (000s)
Adults 16+   1000    50                  500
Men 16+      480     60                  288
Women 16+    520     40                  208
The summed demographics reach in 000s is 288 + 208 = 496. To get consistency, the
reach estimates should be balanced to the total of 500:
Adults 16+, Men 16+ and Women 16+ Schedule Analysis
             Summed    Initial     Initial        Final          Final
             Weights   Reach (%)   Reach (000s)   Reach (000s)   Reach (%)
Adults 16+   1000      50          500            500            50.0
Men 16+      480       60          288            290.3          60.5
Women 16+    520       40          208            209.7          40.3
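The balancing in this example amounts to scaling each demographic reach by the ratio of the Adults 16+ total to the summed demographic reach (500/496). A minimal sketch, with an illustrative function name:

```python
def balance_top_down(total_reach_000s, demo_reach_000s):
    """Scale demographic reach (000s) so that it sums to the total reach."""
    factor = total_reach_000s / sum(demo_reach_000s.values())
    return {name: reach * factor for name, reach in demo_reach_000s.items()}

final = balance_top_down(500.0, {"Men 16+": 288.0, "Women 16+": 208.0})
print({name: round(reach, 1) for name, reach in final.items()})
# {'Men 16+': 290.3, 'Women 16+': 209.7}
```

Dividing the balanced figures by the universe estimates (480 and 520) gives the final reach percentages of 60.5 and 40.3.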
1.3.5 Negative Reach Build
The reach projection model can return an apparent negative reach build as
boards are added to a campaign. This is a phenomenon that affects all reach
projection models to a certain extent. Negative reach build is a manifestation of
forecast error in the model, and essentially means that there is no significant
difference between the reach of the two schedules.
1.3.6 Schedule Reliability
The table below presents estimates of standard error for campaigns of varying
sizes for adults 16+, based on the sample of 1933 respondents in Gauteng and
KZN. Estimates for demographics can be approximated from these data using
the inverse square-root relationship between sample size and standard error: a
demographic that is a quarter of the sample will have double the standard error.
More generally, the standard error for a demographic can be estimated as follows:
Demographic Standard Error =
Adults 16+ Standard Error x Square Root (Adults 16+ Sample Size / Demographic
Sample Size)
Number of   28 Day   Standard    95% confidence
Boards      GRPs     Error (%)   interval
10          52       11          41 - 63
20          105      7           90 - 119
50          261      5           234 - 289
100         523      5           474 - 571
150         784      4           723 - 845
200         1045     4           965 - 1125
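The demographic standard error formula can be sketched as below. The function name is illustrative, and the demographic sample size in the example (a quarter of the 1933 respondents) is hypothetical.

```python
import math

def demographic_standard_error(adult_se, adult_sample, demo_sample):
    """Scale the Adults 16+ standard error to a smaller demographic sample."""
    return adult_se * math.sqrt(adult_sample / demo_sample)

# A 50-board campaign has an Adults 16+ standard error of 5% in the table.
# For a demographic that is a quarter of the sample, the error doubles.
print(demographic_standard_error(5.0, 1933, 1933 / 4))  # 10.0
```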
As a guide, we recommend that campaign analyses require at least
40 separate respondents in the target demographic to be exposed to the
campaign. In practice this means that a few top-rated boards will support
analysis on their own for broad demographics, but a typical campaign aimed at
16-34s (for example) would require five or more boards, and in some cases (e.g.
lower-rated boards in rural areas) considerably more boards would need to be
selected.
1.3.7 Sites with Zero Passage
A minority of sites have zero passage (no sample respondents exposed to the
site in the fieldwork period). The Board Impressions File sets the Adults 16+
value for each board, and this will be extended later in 2007 to include
demographics. In the interim, some smaller campaigns will have zero
impressions for some demographics. In this case, Nielsen would recommend
estimating the campaign’s demographic profile by taking the average
demographic profile of the site types in the campaign. Note that this should be
an acceptable approach for sex/age demographics, but will be less reliable for
other characteristics such as race and income.
1.4 Example: Estimating Adult 35-49 GRPs for a 100 GRP Campaign
Consider a campaign aimed at Adults 35-49 that delivers 100 Adult 16+ GRPs,
composed of 60 GRPs on Super Signs and 40 GRPs on 96 Sheets.
From the survey data, Adults 35-49 GRPs index at 151 on Adults 16+ for all
Super Signs and at 143 for 96 Sheets. Adult 35-49 GRPs for the campaign are
given by:
Super Signs: 60 x 1.51 = 90.6
96 Sheets:   40 x 1.43 = 57.2
Total Campaign GRPs for Adults 35-49 = 147.8
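The same calculation can be written as a short Python sketch. The function name and the dictionary layout are assumptions made for illustration; the GRPs and indices are those of the example.

```python
def demographic_grps(adult_grps_by_site_type, index_by_site_type):
    """Apply each site type's demographic index (Adults 16+ = 100) to its GRPs."""
    return sum(grps * index_by_site_type[site_type] / 100.0
               for site_type, grps in adult_grps_by_site_type.items())

total = demographic_grps(
    {"Super Signs": 60.0, "96 Sheets": 40.0},
    {"Super Signs": 151, "96 Sheets": 143},
)
print(round(total, 1))  # 147.8
```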
Section 2: Schedule Reach Calculation for 2+ Flights
2.1 Overview
This section specifies the methodology for calculating reach and frequency for
schedules with two or more components of different time periods. An example is
a four week run of bus shelters combined with two weeks of billboards.
The methodology outlined here produces the final delivered reach and frequency
of multi-flight schedules. Note that there is no need to go through this procedure
for two components of equal length (e.g. 4 weeks of bus shelters and 4 weeks of
billboards) – these can be combined and considered as a single schedule, even
if they occur at different times.
2.2 Methodology and Example
Example: We have two components to the schedule. Component 1 lasts four
weeks and Component 2 lasts two weeks. We calculate Reach and Frequency
using the NBD in the normal way and label them for subsequent use in
calculation:
               Length    GRPs   Reach   Label
Component 1:   4 weeks   1000   60      R12
Component 2:   2 weeks   400    40      R21
GRPs
The first and easiest part is to calculate the combined GRP delivery, which is
simply the sum of the two components, in this case 1400 GRPs.
Reach
We first calculate the GRPs and Reach for the two components
combined, for 2 weeks and 4 weeks:
                  Length    GRPs   Reach   Label
Components 1/2:   2 weeks   900    60      R31
Components 1/2:   4 weeks   1800   70      R32
We also calculate Component 1 for 2 weeks, and Component 2 for 4 weeks:
               Length    GRPs   Reach   Label
Component 1:   2 weeks   500    45      R11
Component 2:   4 weeks   800    50      R22
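These different reach runs allow us to populate some of this table (rows are
Component 1, columns are Component 2):
                                     Component 2
Component 1              Period 1   Period 2, not Period 1   Neither Period   Total
Period 1                 25.0       *                        -                45
Period 2, not Period 1   *          *                        -                15
Neither Period           -          -                        30.0             40
Total                    40         10                       50
* The three cells marked * together total 15.0.
- Not yet known.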
The data values are related to the labels as follows:
                                     Component 2
Component 1              Period 1      Period 2, not Period 1   Neither Period   Total
Period 1                 R11+R21-R31   *                        -                R11
Period 2, not Period 1   *             *                        -                R12-R11
Neither Period           -             -                        100-R32          100-R12
Total                    R21           R22-R21                  100-R22
* The three cells marked * together total (R12+R22-R32) - (R11+R21-R31).
- Not yet known.
With this information, we need to find the best solution to the following:
                                     Component 2
Component 1              Period 1      Period 2, not Period 1   Neither Period   Total
Period 1                 R11+R21-R31   a                        d                R11
Period 2, not Period 1   b             c                        e                R12-R11
Neither Period           f             g                        100-R32          100-R12
Total                    R21           R22-R21                  100-R22
Here is the solution: essentially we calculate a, b and c assuming first a random
duplication between the two components of the schedule, which we then factor to fit the
results of our combined schedule analysis. We then derive d, e, f and g directly (in fact
we really only need f).
First, we ensure consistency between the individual component analyses and the
combined analysis:
100   Set k1 = R11 + R21 - R31
      If k1 < 0, k1 = 0
200   Set k2 = R12 + R22 - R32
      If k2 < 0, k2 = 0
Second, calculate a, b and c using random duplication:
300   a = R11 x (R22-R21)/100
      b = (R12-R11) x R21/100
      c = (R12-R11) x (R22-R21)/100
      Set k3 = a + b + c
Third, factor a, b and c to the combined schedule analysis results:
400   Set k4 = (k2-k1)/k3
      a = a x k4
      b = b x k4
      c = c x k4
Fourth, derive d, e, f and g:
500   d = R11 - k1 - a
      e = R12 - R11 - b - c
      f = R21 - k1 - b
      g = R22 - R21 - a - c
The final task is to calculate the combined reach, which consists of these elements:
- those reached by Component 1 in periods 1 or 2 (call this X1)
- those reached by Component 2 in period 1, who were not reached by Component 1
  in either period (call this X2)
X1 = R12
X2 = f
Combined Reach is therefore R12 + f.
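Steps 100 to 500 and the final reach calculation can be sketched in Python as follows. The function name and the dictionary return format are illustrative. The guard against k3 = 0 is an assumption added to avoid division by zero and is not part of the procedure above.

```python
def combine_two_flights(r11, r12, r21, r22, r31, r32):
    """Combine two flights of different lengths; all inputs are reach in %."""
    k1 = max(r11 + r21 - r31, 0.0)               # step 100
    k2 = max(r12 + r22 - r32, 0.0)               # step 200
    a = r11 * (r22 - r21) / 100.0                # step 300: random duplication
    b = (r12 - r11) * r21 / 100.0
    c = (r12 - r11) * (r22 - r21) / 100.0
    k3 = a + b + c
    k4 = (k2 - k1) / k3 if k3 else 0.0           # step 400 (zero guard is an assumption)
    a, b, c = a * k4, b * k4, c * k4
    d = r11 - k1 - a                             # step 500
    e = r12 - r11 - b - c
    f = r21 - k1 - b
    g = r22 - r21 - a - c
    return {"a": a, "b": b, "c": c, "d": d, "e": e, "f": f, "g": g,
            "combined_reach": r12 + f}

# Inputs from the example: R11=45, R12=60, R21=40, R22=50, R31=60, R32=70
result = combine_two_flights(45, 60, 40, 50, 60, 70)
print(result["f"], result["combined_reach"])  # 7.5 67.5
```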
In this example, we get the following results (rows are Component 1, columns are
Component 2):
                                     Component 2
Component 1              Period 1   Period 2, not Period 1   Neither Period   Total
Period 1                 25.0       5.6                      14.4             45
Period 2, not Period 1   7.5        1.9                      5.6              15
Neither Period           7.5        2.5                      30.0             40
Total                    40         10                       50
The final schedule reach is then 60 + 7.5 = 67.5.
2.3 Generating the frequency distribution at the end of the schedule
The frequency distribution of the schedule can be calculated by fitting the
Negative Binomial Distribution to the schedule with the final reach and GRPs as
inputs for parameter calculation. Note:
i)   There is no need to adjust the parameters since the results are final.
ii)  The distribution is achieved by setting t = 1.
iii) The NBD is only valid in multi-component schedule analyses for the
     complete schedule. It cannot be used for reach build over time during
     the period of schedule, because of the different time periods of the
     components.
The results that feed into parameter calculation are:
GRP Total= GRP of R12 + GRP of R21
Reach Final= R12 + f
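As an illustration, the sketch below fits the NBD to the two-flight example (1400 GRPs in total, final reach 67.5) using the iteration from Section 1.3.2 without the scaling step, and then builds the distribution with t = 1. The function names and the max_iter guard are assumptions, and the sketch assumes reach strictly between 0% and 100%.

```python
import math

def fit_final_nbd(total_grps, final_reach, tol=0.0001, max_iter=1000):
    """Fit NBD parameters (k, alpha) to final schedule GRPs and reach (%)."""
    f0 = 1.0 - final_reach / 100.0               # proportion with zero exposures
    c = total_grps / (100.0 * math.log(f0))
    if c >= -1:
        raise ValueError("Poisson condition: use lambda = GRPs/100 instead")
    a = -2.0 * (1.0 + c)
    for _ in range(max_iter):                    # max_iter is a safety guard
        b = a
        a = c * (a - (1 + a) * math.log(1 + a)) / (1 + a + c)
        if abs(a - b) < tol:
            break
    return total_grps / (100.0 * a), 1.0 / a     # k, alpha (no scaling needed)

def frequency_distribution(k, alpha, max_n, t=1.0):
    """fp(0..max_n) for the complete schedule (t = 1)."""
    probs = [(alpha / (alpha + t)) ** k]
    for n in range(1, max_n + 1):
        probs.append(((k + n - 1) / n) * (t / (alpha + t)) * probs[-1])
    return probs

k, alpha = fit_final_nbd(1400.0, 67.5)
dist = frequency_distribution(k, alpha, 5)
print(round(100 * (1 - dist[0]), 1))  # reproduces the input reach: 67.5
```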
2.4 Three or more Schedules
The process can be applied to three or more schedules in an iterative way. The
two shortest schedules should be combined first, and then considered to be a
single schedule, with the combined reach calculated as described above. The
third schedule is then combined with this combined schedule in a similar way,
with the period 1 inputs being the results for the first two schedules. The overlaps
in Periods 1 and 2 between Schedules (1 and 2) and 3 are estimated by
factoring.
Example using the Period 1 and 2 schedules above, and a third schedule:
We have these basic components: the 70 for Components 1/2 is R32 in the
example above, and is the reach for these components over four weeks.
Component   Duration   Reach
1/2         4 Weeks    70.0
1/2         6 Weeks    75.0
3           4 Weeks    35.0
3           6 Weeks    40.0
1/2/3       4 Weeks    80.0
1/2/3       6 Weeks    84.0
To take into account the fact that Component 2 is only two weeks and the reach
of Components 1 and 2 over the 4 weeks is 67.5 (as shown above), we factor the
results as follows:
Component   Duration   Reach   Factored
1/2         4 Weeks    70.0    = 67.5
1/2         6 Weeks    75.0    = (67.5/70) x 75
3           4 Weeks    35.0    35.0
3           6 Weeks    40.0    40.0
1/2/3       4 Weeks    80.0    = 80 x (67.5+35)/(70+35)
1/2/3       6 Weeks    84.0    = 84 x (67.5+40)/(70+40)
This gives us the following inputs into the procedure, with the
Reference Codes related to the procedure as described above.
Component   Duration   Reach   Factored                   Results   Reference
1/2         4 Weeks    70.0    = 67.5                     67.5      R21
1/2         6 Weeks    75.0    = (67.5/70) x 75           72.3      R22
3           4 Weeks    35.0    35.0                       35.0      R11
3           6 Weeks    40.0    40.0                       40.0      R12
1/2/3       4 Weeks    80.0    = 80 x (67.5+35)/(70+35)   78.1      R31
1/2/3       6 Weeks    84.0    = 84 x (67.5+40)/(70+40)   82.1      R32
Using the method above, this then delivers a total reach for the three
components over the six weeks of 79.4.
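The three-schedule example can be traced with a short Python sketch. The function name is illustrative, only the quantities needed for the combined reach (k1, k2, k3, k4, b and f) are computed, and the factored inputs are left unrounded.

```python
def combined_reach(r11, r12, r21, r22, r31, r32):
    """Combined reach R12 + f from the two-flight procedure in Section 2.2."""
    k1 = max(r11 + r21 - r31, 0.0)
    k2 = max(r12 + r22 - r32, 0.0)
    a = r11 * (r22 - r21) / 100.0
    b = (r12 - r11) * r21 / 100.0
    c = (r12 - r11) * (r22 - r21) / 100.0
    k4 = (k2 - k1) / (a + b + c)
    f = r21 - k1 - b * k4
    return r12 + f

final_12 = 67.5                                    # final reach of Components 1/2
r21 = final_12                                     # 1/2 over 4 weeks, factored
r22 = (final_12 / 70.0) * 75.0                     # 1/2 over 6 weeks, factored
r11, r12 = 35.0, 40.0                              # Component 3 over 4 and 6 weeks
r31 = 80.0 * (final_12 + 35.0) / (70.0 + 35.0)     # 1/2/3 over 4 weeks, factored
r32 = 84.0 * (final_12 + 40.0) / (70.0 + 40.0)     # 1/2/3 over 6 weeks, factored

total = combined_reach(r11, r12, r21, r22, r31, r32)
print(round(total, 1))  # 79.4
```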