2.2 - United Nations Statistics Division

Weighting issues
Julian Chow
Industrial and Energy Statistics Section
United Nations Statistics Division (UNSD)
Email: chowj@un.org
Overview: 1st Session, Weighting issues
 The role of weights in an index
 Theory: weights in the Laspeyres formula
 Determining IIP weights in practice
 Weight updating
 Fixed weight index vs. chained index
Overview: 2nd Session, Missing weights
 Missing weights for the most recent periods
 Missing weights for the entire time span of one component series
 Discussion
Question?
How can a change in the production level of Coca-Cola be reflected in the IIP?
Example
 Elementary observation (quantity/value produced by an establishment): Coca-Cola
 Product: Waters, with added sugar, other sweetening matter or flavoured, i.e. soft drinks
 Product group: Other non-alcoholic caloric beverages (CPC Ver.2 Sub-class 24990)
 4-digit industry: Manufacture of soft drinks; production of mineral waters and other bottled waters (ISIC Rev.4 Class 1104)
 3-digit industry: Manufacture of beverages (ISIC Rev.4 Group 110)
 2-digit industry: Manufacture of beverages (ISIC Rev.4 Division 11)
 1-digit industry: Manufacturing (ISIC Rev.4 Section C)
IIP Structure
Total IIP, built up through the 4-digit, 3-digit, 2-digit and 1-digit ISIC levels
 Stage 1: Product weights: value of output obtained via census/survey. Individual sampled products are assigned to one product group.
 Stage 2: Product group weights: value of output obtained via census/survey. Product groups are assigned to one 4-digit ISIC branch.
 Stage 3: Weights for industry branches: gross value added at basic prices.
The role of weights in the index
 Weights are used to aggregate series into higher-level aggregates
 Aggregation can be done at different levels
 Weights have to be chosen accordingly
 Weights have to reflect the relative importance of the individual components within the aggregate
 Weights determine the impact that a particular volume change will have on the overall index
More about weights
 Over time, establishment production levels shift in response to economic conditions.
 Relative importance may change for
   Products within a product group
   Product groups within an industry
   Lower-level industries within higher-level aggregates
 For the IIP to reflect these movements as well as possible, the weights have to reflect these changes
Recap: the Laspeyres volume index
Notations
 p^t: prices at time t
 q^t: quantities at time t
 Base period: t = 0
 i: units (i.e. products, product groups or industries) to be aggregated
 n: number of units (i.e. products, product groups or industries) to be aggregated
Laspeyres volume index
Fixed prices from the base period
"How much more would the basket be worth in the current period if prices in the current period were the same as in the base period?"
$$ I^{0:t}_{Laspeyres} = \frac{\sum_{i=1}^{n} p_i^0 q_i^t}{\sum_{i=1}^{n} p_i^0 q_i^0} \times 100 $$
Numerator: value of the basket in the current period using base period prices
Denominator: value of the basket in the base period
Laspeyres volume index: another form
The volume index formula may be rewritten so that indices can be constructed using values instead of prices
$$ I^{0:t}_{Laspeyres} = \frac{\sum_{i=1}^{n} p_i^0 q_i^t}{\sum_{i=1}^{n} p_i^0 q_i^0} \times 100 = \sum_{i=1}^{n} \frac{q_i^t}{q_i^0} \, w_i^0 \times 100 $$
$$ \text{where } w_i^0 = \frac{p_i^0 q_i^0}{\sum_{j=1}^{n} p_j^0 q_j^0} $$
w_i^0 is the value share (weight) of unit i at period 0 prices and quantities
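The equivalence of the two forms can be checked numerically. The following is a minimal sketch with made-up prices and quantities for three units; the function names and the data are illustrative assumptions, not part of the presentation.

```python
# Hypothetical base-period prices and quantities for three units
# (illustrative numbers, not taken from the presentation).
p0 = [2.0, 5.0, 1.0]    # p_i^0: base-period prices
q0 = [10.0, 4.0, 20.0]  # q_i^0: base-period quantities
qt = [12.0, 3.0, 25.0]  # q_i^t: current-period quantities


def laspeyres_from_prices(p0, q0, qt):
    """Value of the current basket at base prices over the base-period value."""
    numerator = sum(p * q for p, q in zip(p0, qt))
    denominator = sum(p * q for p, q in zip(p0, q0))
    return numerator / denominator * 100


def laspeyres_from_value_shares(p0, q0, qt):
    """Base-period value shares applied to quantity relatives."""
    base_value = sum(p * q for p, q in zip(p0, q0))
    weights = [p * q / base_value for p, q in zip(p0, q0)]
    relatives = [current / base for current, base in zip(qt, q0)]
    return sum(w * r for w, r in zip(weights, relatives)) * 100


direct = laspeyres_from_prices(p0, q0, qt)
weighted = laspeyres_from_value_shares(p0, q0, qt)
print(round(direct, 2), round(weighted, 2))
```

Both functions return the same index (up to floating-point rounding), which is why value shares and quantity relatives are sufficient in practice when individual prices are not available.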
Weights in the Laspeyres formula
Price weights (applied to quantities)
$$ I^{0:t}_{Laspeyres} = \frac{\sum_{i=1}^{n} p_i^0 q_i^t}{\sum_{i=1}^{n} p_i^0 q_i^0} \times 100 = \sum_{i=1}^{n} \left( \frac{p_i^0}{\sum_{j=1}^{n} p_j^0 q_j^0} \right) q_i^t \times 100 $$
Value weights (applied to quantity relatives)
$$ I^{0:t}_{Laspeyres} = \frac{\sum_{i=1}^{n} p_i^0 q_i^t}{\sum_{i=1}^{n} p_i^0 q_i^0} \times 100 = \sum_{i=1}^{n} \left( \frac{p_i^0 q_i^0}{\sum_{j=1}^{n} p_j^0 q_j^0} \right) \left( \frac{q_i^t}{q_i^0} \right) \times 100 $$
Determining IIP weighting data
in practice
Practical steps for selecting and
determining weights
 Determine sampling weights at the
establishment and product level
 Determine weights for individual sample
products
 Determine weights for the product groups
 Determine weights for the industry groups
Sampling in the IIP: Example
 A random sample of establishments
   Initially drawn from the business register as part of the product survey of goods (e.g. PRODCOM) (e.g. 25,000 establishments in total)
   A subset of these is sampled for the IIP (e.g. 7,000 establishments sampled)
 A random sample of products from the sampled establishments
   Again, using information provided in the product survey (e.g. resulting in 9,000 product-establishment pairs)
 A purposive sample of elementary observations from the sampled product-contributor pairs
   Undertaken using the judgement of the respondent, but scrutinised by a subject expert
Sampling weights at the establishment and product level
 The weights at the establishment and product level used to obtain the value of output of a particular product depend on the sampling scheme
 If probability sampling techniques are used, the inverses of the sampling fractions are used as the weights
 Sampling weights are not discussed in detail here, since they are a topic of survey sampling
 This leaves three fundamental levels of weights in the IIP compilation
   Product weights
   Product group weights
   Weights for industry branches
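As a small illustration of the inverse-sampling-fraction rule mentioned above, the sketch below assumes simple random sampling and treats the 25,000 establishments of the product survey as the sampling frame for the 7,000 IIP establishments. The function name and the reported output figure are hypothetical.

```python
def sampling_weight(frame_size, sample_size):
    """Inverse of the sampling fraction sample_size / frame_size."""
    if sample_size <= 0 or sample_size > frame_size:
        raise ValueError("sample_size must be between 1 and frame_size")
    return frame_size / sample_size


# Assumption: simple random sample of 7,000 out of 25,000 establishments.
establishment_weight = sampling_weight(25_000, 7_000)

# Hypothetical output value reported by one sampled establishment.
reported_output = 120.0
estimated_contribution = establishment_weight * reported_output
print(round(establishment_weight, 3), round(estimated_contribution, 1))
```

Under other designs (stratification, probability proportional to size) the weight would differ by stratum or unit; this sketch covers only the equal-probability case.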
Product weights
Reflect the relative importance of a particular product in the product group
 E.g. the relative importance of Coca-Cola in the product ‘soft drinks’
The share of value of output should be used to weight each product in the product group.
The product weights are generally obtained by conducting product censuses or surveys.
Product weights
Product sales, though, are sometimes used in lieu of value of output as the weighting variable at this level of the index structure.
Value of output
- work-in-progress
- output produced this period and entered into inventory
+ inventory produced in the past and sold in this period
= product sales (value of output sold)
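The reconciliation between value of output and product sales shown above is simple arithmetic; a minimal sketch follows. The function name, argument names and the numbers are illustrative assumptions.

```python
def product_sales(value_of_output, work_in_progress,
                  output_to_inventory, past_inventory_sold):
    """Product sales derived from value of output, following the slide's identity."""
    return (value_of_output
            - work_in_progress
            - output_to_inventory
            + past_inventory_sold)


# Hypothetical figures for one product in one period.
sales = product_sales(
    value_of_output=100.0,
    work_in_progress=5.0,
    output_to_inventory=10.0,
    past_inventory_sold=8.0,
)
print(sales)
```

The gap between the two measures grows with inventory movements, which is one reason sales are only a proxy for output as a weighting variable.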
Product group weights
Share of value of output (or proxies thereof) by product group within its ISIC class
These “values of output” allow product groups to be weighted together (combined) and reflect the relative importance of each product group within an ISIC class.
 E.g. the relative importance of soft drinks in ‘Other non-alcoholic caloric beverages’ (CPC Ver.2 24990)
Product group weights
Each product group is assigned to just one
ISIC 4-digit industry.
Sources
 The product group weights are generally
obtained via the conduct of product
censuses or surveys.
Industry weights
Share of gross value added (GVA) at basic prices of each industry in the total for all industries in scope of industrial production.
GVA at basic prices
= value of output
- intermediate consumption
+ subsidies receivable on products
- taxes payable on products
Industry weights
Using value of output as the weight is not suitable
 It introduces distortion by giving a higher weight to any industry that uses intermediate goods and services
 It double counts intermediate goods and services in the final aggregate
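To make the distortion concrete, the sketch below compares output-based and GVA-based shares for two hypothetical industries. The GVA function follows the identity given on the earlier slide; the industry names, figures and helper names are assumptions for illustration only.

```python
# Hypothetical industries: A uses many intermediate inputs, B uses few.
industries = {
    "A": {"output": 200.0, "intermediate": 150.0, "subsidies": 0.0, "taxes": 0.0},
    "B": {"output": 100.0, "intermediate": 20.0, "subsidies": 0.0, "taxes": 0.0},
}


def gva_at_basic_prices(row):
    """Output - intermediate consumption + product subsidies - product taxes."""
    return row["output"] - row["intermediate"] + row["subsidies"] - row["taxes"]


def shares(values):
    """Convert absolute values into shares that sum to 1."""
    total = sum(values.values())
    return {name: value / total for name, value in values.items()}


output_shares = shares({k: v["output"] for k, v in industries.items()})
gva_shares = shares({k: gva_at_basic_prices(v) for k, v in industries.items()})
print(output_shares)
print(gva_shares)
```

With these numbers, industry A has the larger output share but the smaller GVA share, which shows how output weights would overstate industries that rely heavily on intermediate inputs.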
Industry weights
 GVA vs NVA
   Net value added (NVA) = gross value added (GVA) - consumption of fixed capital (depreciation)
 Why select GVA, not NVA?
   Consumption of fixed capital is quite difficult to measure
   GVA relates more to supply-side considerations of meeting final demand, including gross capital formation
   NVA, by contrast, is more meaningful for an income approach to measuring welfare and living standards
Industry weights
GVA should be used as weights starting from the 4-digit level of ISIC
Sources
 Such information is available as a result of annual national accounts compilation.
 However, some countries need to use other comprehensive data sources, such as an industry survey or economic census, to obtain weights for the lower levels of ISIC.
Summary so far
Total IIP, built up through the 4-digit, 3-digit, 2-digit and 1-digit ISIC levels
 Stage 1: Product weights: value of output obtained via census/survey. Individual sampled products are assigned to one product group.
 Stage 2: Product group weights: value of output obtained via census/survey. Product groups are assigned to one 4-digit ISIC branch.
 Stage 3: Weights for industry branches: gross value added at basic prices.
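The three stages can be sketched as repeated weighted averaging, moving up one level at a time. All index values, output values and GVA figures below are hypothetical, and `weighted_index` is an illustrative helper, not a prescribed routine.

```python
def weighted_index(indices, values):
    """Aggregate component indices using shares of the given absolute values."""
    total = sum(values)
    return sum(index * value / total for index, value in zip(indices, values))


# Stage 1: products within a product group, weighted by value of output.
product_indices = [110.0, 90.0]
product_output = [70.0, 30.0]
group_index = weighted_index(product_indices, product_output)

# Stage 2: product groups within a 4-digit ISIC class, weighted by value of output.
group_indices = [group_index, 100.0]
group_output = [60.0, 40.0]
class_index = weighted_index(group_indices, group_output)

# Stage 3: ISIC classes into a higher aggregate, weighted by GVA at basic prices.
class_indices = [class_index, 98.0]
class_gva = [50.0, 150.0]
total_index = weighted_index(class_indices, class_gva)

print(round(group_index, 1), round(class_index, 1), round(total_index, 1))
```

The same helper is reused at every level; only the weighting variable changes, from value of output in stages 1 and 2 to gross value added in stage 3.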
Calculating weights
Weights formula
$$ w_i^0 = \frac{V_i^0}{\sum_{j=1}^{n} V_j^0} $$
By consequence
$$ \sum_{i=1}^{n} w_i^0 = 1 $$
 V: absolute weight (value)
 w: relative weight
 Base period: t = 0
 i: products, product groups or industries to be aggregated
 n: number of products, product groups or industries to be aggregated
Example
Suppose the product group “Other non-alcoholic caloric beverages” (CPC Ver.2 24990) contains the following products:
 Soft drinks (output value = 70)
 Non-alcoholic beverages not containing (output value = 20)
 Non-alcoholic beverages containing milk fat (output value = 10)
The product weights within the product group are
 Soft drinks [weight = 70/(70+20+10) = 0.7]
 Non-alcoholic beverages not containing [weight = 20/(70+20+10) = 0.2]
 Non-alcoholic beverages containing milk fat [weight = 10/(70+20+10) = 0.1]
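The same calculation can be written as a short routine. This sketch uses the output values from the example above; the function name is an assumption, and the second product label is reproduced as it appears on the slide.

```python
import math


def relative_weights(values):
    """Convert absolute weights V_i into relative weights w_i = V_i / sum(V)."""
    total = sum(values.values())
    return {name: value / total for name, value in values.items()}


# Output values from the slide's example (product labels as given there).
output_values = {
    "Soft drinks": 70.0,
    "Non-alcoholic beverages not containing": 20.0,
    "Non-alcoholic beverages containing milk fat": 10.0,
}

weights = relative_weights(output_values)
for name, weight in weights.items():
    print(f"{name}: {weight:.1f}")

# The relative weights sum to 1 (up to floating-point rounding).
assert math.isclose(sum(weights.values()), 1.0)
```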
Weight updating
Why update the weights?
To reflect the changing structure of the economy
Over time, production levels shift in response to economic conditions
Example
Smart phones
Typewriters
Key issues to consider when updating index weights
The frequency of weight updates
The method used to incorporate new weights into the index structure
Update frequency
The update frequency of IIP weights can be linked to
 The need to accurately reflect the current relative importance of product groups and industries
 Data availability
 The index type used to compile the index
  • A Laspeyres-type index provides some flexibility regarding update frequency, as its weights are not derived from the current period
Update frequency: recommendation
 Industry weights
  • Annual
  • The latest weights available are likely to be from t-2 or t-3
  • Frequent updating of weights can alleviate the substitution bias/changing weights problem
 Product group weights
  • At least every 5 years
  • Less frequent than for the industry level, due to resource and data constraints
 Product weights
  • The weights of individual products are updated at the same time as those of the product group
How to select the reference period?
Concepts of reference period
 Quantity reference period
   The period whose volumes appear in the denominators of the volume relatives used to calculate the index
 Weight reference period
   The period, usually a year, whose values serve as weights for the index
 Index reference period
   The period for which the index is set equal to 100
 The three types of reference period may coincide, but frequently do not.
Weight reference period
Laspeyres-type volume index with weights
updated annually
 The weight reference period will always
be the most recent period (year) for
which weights are available
Weight reference period
In circumstances of less frequent weight updates, the
weight reference period should therefore possess
the following characteristics:
(a) Reasonably normal/stable (i.e. typical of recent
and likely future years);
(b) not too distant from the reference period;
(c) clearly identified when analyzing and comparing
the index results.
Summary
 Industry level weights
  • An annual update should be carried out.
  • Should ideally be National Accounts value added figures at basic prices; adjustments may be necessary to make them available in a timely manner.
 Product group weights
  • Should be updated frequently, at least every 5 years
  • Obtained by determining the share of value of output, by conducting product censuses or surveys
 Product weights
  • The weights of individual products are updated at the same time as those of the product group
  • Obtained by determining the share of value of output, by conducting product censuses or surveys
Fixed weights vs chained index: Concepts
Fixed base volume index
Hold one period as the base period and compare all volumes back to this period
Calculate the movement back to the base period for each successive time point
Each index in the time series is a comparison of that period with the base period
Fixed base volume index
Fixed base volume index, from time 0 to 4
TABLE 2.12 - PRICES AND QUANTITIES FOR SIX COMMODITIES, WITH DIRECT LASPEYRES VOLUME INDICES.
Value of basket (£’s) fixed in period 0 prices

Commodity (value = q^t x p^0)        Period 0        1        2        3        4
A Agricultural commodity                 1.00     1.20     1.00     0.80     1.00
B Energy                                 1.00     3.00     1.00     0.50     1.00
C Traditional manufacture                2.00     2.60     3.00     3.20     3.20
D High-tech goods                        1.00     0.70     0.50     0.30     0.10
E Traditional services                   4.50     6.30     7.65     8.55     9.00
F High-tech services                     0.50     0.40     0.30     0.20     0.10
Total                                   10.00    14.20    13.45    13.55    14.40
Direct Laspeyres volume index           100.0    142.0    134.5    135.5    144.0
(change from period 0)
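The direct index row of Table 2.12 can be reproduced from the commodity rows. The sketch below uses the table's values of the basket at period 0 prices; the variable names are illustrative.

```python
# Value of each commodity's quantity in periods 0..4 at period 0 prices
# (the rows of Table 2.12).
basket_values = {
    "A Agricultural commodity": [1.00, 1.20, 1.00, 0.80, 1.00],
    "B Energy": [1.00, 3.00, 1.00, 0.50, 1.00],
    "C Traditional manufacture": [2.00, 2.60, 3.00, 3.20, 3.20],
    "D High-tech goods": [1.00, 0.70, 0.50, 0.30, 0.10],
    "E Traditional services": [4.50, 6.30, 7.65, 8.55, 9.00],
    "F High-tech services": [0.50, 0.40, 0.30, 0.20, 0.10],
}

periods = range(5)

# Total value of the basket in each period, at period 0 prices.
totals = [sum(row[t] for row in basket_values.values()) for t in periods]

# Direct (fixed base) Laspeyres volume index: each total relative to period 0.
direct_index = [round(total / totals[0] * 100, 1) for total in totals]
print(direct_index)
```

Because every period is valued at period 0 prices, each index value is a direct comparison with period 0 and no intermediate period enters the calculation.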
Fixed base volume index
[Figure: Direct (fixed base) index; index (period 0 = 100.0) plotted against periods 0 to 4]
Chained volume index
 Calculate consecutive period volume indices:
   Use a period 0 basket to look at period 0 to 1 changes
   Use a period 1 basket to look at period 1 to 2 changes
   Use a period 2 basket to look at period 2 to 3 changes
   Use a period 3 basket to look at period 3 to 4 changes
 Chain these results together to get a measure of volume change from period 0 to period 4
Chained volume index
[Figure: Consecutive period indices; index (period 0 = 100.0) plotted against periods 0 to 4]
Chained volume index
CALCULATION OF INDIRECT (CHAINED) LASPEYRES VOLUME INDICES AND COMPARISON WITH DIRECT (FIXED BASE) LASPEYRES VOLUME INDICES

Index                                                        Period 0       1       2       3       4
Direct volume index, period 0 to 1                              100.0   142.0       -       -       -
Direct volume index, period 1 to 2                                  -   100.0    96.1       -       -
Indirect (chained) volume index, periods 0 to 2                 100.0   142.0   136.5       -       -
Direct volume index, period 2 to 3                                  -       -   100.0    97.8       -
Indirect (chained) volume index, periods 0 to 3                 100.0   142.0   136.5   133.5       -
Direct volume index, period 3 to 4                                  -       -       -   100.0    99.7
Indirect (chained) volume index, periods 0 to 4                 100.0   142.0   136.5   133.5   133.1
Direct (fixed base, period 0) volume index, periods 0 to 4      100.0   142.0   134.5   135.5   144.0

Workshop on Manufacturing Statistics for ECLAC member states, ECLAC, Santiago, 17 March 2010 (slide 50 of 89)
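The chained row of the table follows from multiplying the consecutive period indices together. A minimal sketch, using the link values from the table (variable names are illustrative):

```python
# Consecutive period (direct) volume indices from the table:
# period 0 to 1, 1 to 2, 2 to 3 and 3 to 4, each with the earlier period = 100.
links = [142.0, 96.1, 97.8, 99.7]

# Chain the links: each new value is the previous chained value
# scaled by the next period-to-period movement.
chained = [100.0]
for link in links:
    chained.append(chained[-1] * link / 100)

chained_rounded = [round(value, 1) for value in chained]
print(chained_rounded)
```

The result can be compared with the direct (fixed base) index for the same periods, 100.0, 142.0, 134.5, 135.5 and 144.0; the two measures diverge from period 2 onwards because the chained index uses a different basket for each link.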
Chained volume index
 Chaining!
[Figure: Indirect (chained) index; index (period 0 = 100.0) plotted against periods 0 to 4]
Comparison
[Figure: Comparison between fixed basket and chained indices; index (period 0 = 100.0) plotted against periods 0 to 4]
 The two approaches give different results
 The special case of equality is called a transitive index formula
  • Fixed baskets with differential weights are never transitive
Fixed base vs chain index
Fixed base result more attractive
operationally
 Only one revaluing step
 One set of prices (weights at base period)
Why would we chain?
 Updating the basket and weights!
Fixed weights vs chained index: Recommendations
Fixed weights vs chained index: on weights update
 Fixed weight indices
   Weight structure fixed at a particular point in time
   Compare the volume in period t relative to some fixed base period
   When the base year changes, the entire historical series is revised, as values for all periods are recalculated using the new base weights
 Chain-linked indices
   Weights are updated and two indices are linked together to produce a time series
   Unlike the fixed weight approach, the chain approach does not recalculate the entire historical series
   The index is therefore compiled for a succession of different segments, while the original weights for each past segment are kept fixed
Old recommendations
 Use fixed weights for the calculation
 Update weights every 5 years
 Recalculate the entire series
 Problem:
   New weights may better reflect movements in the current periods, but they are not applicable to past data (far from the new weight period)
  • The problem simply shifts to a different period
New recommendations
Update weights more frequently
 Recommended: Annually
Do not re-calculate entire series
Use chain linking to produce time
series for IIP
New recommendations
Chain-linking annually rebased series
allows for better reflection of current
economic structure in the weights in each of
the sub-series
 Current period and weight base period are
not too far apart
 Alleviate substitution bias
 Provide opportunity to incorporate new
products
Linking
 How to link the individual sub-series to obtain longer time
series?
 A linking factor has to be determined to link the new series
to the existing historical series
 This factor is then applied to the new (old) series to
convert it to the old (new) base year
Linking
The long-term time series are calculated from a succession of short-term series with updated weights
• Note: Short-term series can span any number of periods
Linking options
 Annual overlap: linking factor based on
   the annual index for year t
   the index of the same year using weights of year t-1
 One-quarter overlap: linking factor based on
   the index of the first quarter of year t
   the index of the same quarter using weights of year t-1
 Over-the-year technique
   linking factor based on the same periods for years t and t-1
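The annual overlap option can be sketched as follows. This is one possible reading, under the assumption that the overlap year is available on both the old and the new weights and that the new series is rescaled onto the reference of the old one. The quarterly figures and function name are hypothetical.

```python
def annual_overlap_factor(old_overlap_year, new_overlap_year):
    """Ratio of the overlap year's annual average on old vs new weights."""
    mean_old = sum(old_overlap_year) / len(old_overlap_year)
    mean_new = sum(new_overlap_year) / len(new_overlap_year)
    return mean_old / mean_new


# Hypothetical quarterly indices for the overlap year t.
old_overlap = [108.0, 110.0, 112.0, 114.0]  # compiled with old weights
new_overlap = [98.0, 99.0, 101.0, 102.0]    # compiled with new weights

# Hypothetical quarters after the overlap year, available on new weights only.
new_later = [103.0, 104.0]

factor = annual_overlap_factor(old_overlap, new_overlap)

# Apply the factor to the new series so it continues the old series' level.
linked_later = [value * factor for value in new_later]
print(round(factor, 4), [round(v, 2) for v in linked_later])
```

A one-quarter overlap would use the same ratio computed from a single quarter instead of the annual averages; the inverse factor could equally be applied to the old series to express it on the new reference.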
Recommended method
 Annual overlap technique
 More practical for Laspeyres-type volume measures
 Monthly/quarterly data aggregate to annual data
• However, there are no clear established rules for
choosing this approach
• In most cases, the approaches will give similar
results
Drawbacks of chain-linking
Lack of additivity
 The lower-level volume measures (e.g. ISIC 4-digit class) do not sum to the upper levels of the ISIC structure (e.g. 3-digit ISIC level)
When individual price and quantity changes occurring in earlier periods are reversed in later periods, chaining can lead to a worse result than a fixed base index
Summary
The chained Laspeyres-type volume index is the recommended one for the compilation of the IIP
When re-weighting occurs
 Do not recalculate the entire series
 The index is compiled with weights only for those periods to which they relate
For monthly and quarterly data, the advantages of chaining are smaller, as prices and quantities are subject to greater fluctuation.
Conceptual illustration of IIP annual chaining
[Figure: index number (80 to 120) plotted by month, Jan07 to Feb09]
Missing weights
Missing weights for the most recent periods
Missing weights for the entire time span of one component series
Note that there is no ‘recommended’ approach in this area.
Estimation of the missing weights for
the most recent period
In practice, the calculation of the IIP is likely to
use industry weights from period t-2
 i.e. the year 2005 index is likely to be compiled
using industry weights from 2003.
This is because the necessary weighting data for
the industry level are not normally available until
at least 18 months after the reference period.
Missing weights for the most recent
period
 In some countries, for the first few months of a new year
(i.e. year 2006 in this example)
 the index may need to be compiled using the ‘old’
weights (i.e. from 2003) because the ‘new’ weights (i.e.
from 2004) are not yet available.
 In these situations, the IIP should be recalculated (revised)
on the basis of the new weights once they become
available
 i.e. the January 2006 IIP should be calculated using the
weights from 2003 but be recalculated on the basis of
2004 weights when they become available.
Estimation of the missing weights for
the most recent period - others
 Alternative source
 For example, use survey data (e.g. annual survey of
manufacturing) to impute for the missing GVA.
 Administrative sources
 Subjective expert judgement
 Estimation
 Time series method - ARIMA , state-space model, moving
average, etc.
 Regression model
 Imputation procedure.
 Use equal weights
 Need proper quality check!
More on estimations
Imputation
Regression
Exponential smoothing
ARIMA model
State space model
Imputation
 Historic value
 Use historic value such as last year value
 Historic value with trend
 Trend can be based on growth in another variable within the
record, variables in other records, etc.
 Useful method when variables or growth rates are stable
over time
75
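The "historic value with trend" approach can be sketched as carrying the last known value forward by the growth of an auxiliary variable. The function name and the figures are hypothetical; which auxiliary variable is appropriate is a judgement left to the compiler.

```python
def impute_with_trend(last_known_value, aux_previous, aux_current):
    """Carry the last known value forward by the growth of an auxiliary variable."""
    if aux_previous == 0:
        raise ValueError("auxiliary variable must be non-zero in the previous period")
    growth = aux_current / aux_previous
    return last_known_value * growth


# Hypothetical: last year's GVA was 50.0 and an auxiliary variable
# (e.g. turnover from another source) grew from 200.0 to 210.0.
imputed_gva = impute_with_trend(50.0, aux_previous=200.0, aux_current=210.0)
print(round(imputed_gva, 2))
```

Setting `aux_current` equal to `aux_previous` reduces this to the plain historic value method, where last year's value is used unchanged.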
Regression model
$$ Y_t = \beta' X_t + \varepsilon_t $$
 A regression model predicts a missing value using a function of some auxiliary variables X.
 Auxiliary variables can come from the current survey or from other sources, e.g. historical information (previous period value)
 The regression coefficients (beta) can be determined from historic data
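A minimal sketch of regression-based imputation follows, assuming a single auxiliary variable (the previous period value) plus an intercept, fitted by ordinary least squares. The data and function name are hypothetical, and a real application would typically use a statistical package and check the fit.

```python
def fit_simple_regression(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical history: previous period value (X) and current period value (Y).
previous_values = [10.0, 12.0, 14.0, 16.0]
current_values = [10.5, 12.4, 14.7, 16.4]

intercept, slope = fit_simple_regression(previous_values, current_values)

# Impute the missing current value for a unit whose previous value was 18.0.
imputed = intercept + slope * 18.0
print(round(intercept, 3), round(slope, 3), round(imputed, 3))
```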
Exponential Smoothing
$$ \hat{Y}_{t+1|t} = \hat{Y}_{t|t-1} + \alpha \left( Y_t - \hat{Y}_{t|t-1} \right) $$
 \hat{Y}_{t+1|t}: forecast of the t+1 value made at time t
 \alpha: smoothing parameter
 Y_t - \hat{Y}_{t|t-1}: error of the forecast made at time t-1
 The smoothing parameter is determined by
   Subjective consideration
   Minimizing the sum of squared forecasting errors
 Relatively simple to use
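The recursion can be sketched in a few lines. The starting forecast is an assumption (here the first observation is used), as are the weight series and the value of the smoothing parameter.

```python
def exponential_smoothing_forecast(series, alpha, initial_forecast=None):
    """One-step-ahead forecast after applying simple exponential smoothing."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must lie between 0 and 1")
    # Assumption: start from the first observation unless told otherwise.
    forecast = series[0] if initial_forecast is None else initial_forecast
    for observed in series:
        # New forecast = old forecast + alpha * (forecast error).
        forecast = forecast + alpha * (observed - forecast)
    return forecast


# Hypothetical history of an industry's weight (share of GVA).
weight_history = [0.30, 0.32, 0.31, 0.33]
next_weight = exponential_smoothing_forecast(weight_history, alpha=0.5)
print(round(next_weight, 4))
```

With `alpha=1` the forecast is simply the latest observation, and with `alpha=0` it never moves from the starting value; values in between trade responsiveness against smoothness.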
Autoregressive integrated moving average (ARIMA)
ARIMA(p,d,q) model (assume d=0 in this case)
 Identification of the model is necessary before proceeding to forecasting:
   The AR and MA lag orders (i.e. p and q)
   The AR and MA parameters (i.e. φ and θ)
   The order of integration, d
 Complicated to use, but many statistical software packages, such as SAS and R, have built-in procedures for estimation
State Space Model
A state space model is a structural time series model that allows one to
 Obtain unobserved components (unseen driving forces) given an observable series (something you can see)
 Model each time series component (trend, seasonal, and also sampling error) within one structure
 Update estimates at the current time using the ‘Kalman filter’
Once the system is specified in the two equations below, it can be updated through a fixed set of algorithms.
Complicated to use but, unlike ARIMA, it does not require the series to be stationary. In addition, it can be extended to a multivariate approach.
Two equations in matrix form
 Measurement (observable) equation
$$ Y_t = Z \alpha_t + I_t $$
 State (transition) equation
$$ \alpha_t = T \alpha_{t-1} + R \eta_t $$
Notations
 Y: an observable series (e.g. weights)
 α: state vector (e.g. a vector of trend, sampling error)
 Z, T, R: matrices for computations
 I, η: random errors
 Subscripts represent time points
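As an illustration of the Kalman filter update, the sketch below covers only the simplest special case: a scalar "local level" model in which Z = T = R = 1, so the state is a slowly moving level observed with noise. The error variances, starting values and data are all assumptions; in practice the variances would be estimated and a statistical package would normally be used.

```python
def local_level_filter(observations, obs_var, state_var, initial_state, initial_var):
    """Kalman filter for Y_t = alpha_t + I_t, alpha_t = alpha_{t-1} + eta_t."""
    state, variance = initial_state, initial_var
    filtered = []
    for y in observations:
        # Prediction step: the level carries over, its uncertainty grows.
        predicted_state = state
        predicted_var = variance + state_var
        # Update step: move towards the observation by the Kalman gain.
        gain = predicted_var / (predicted_var + obs_var)
        state = predicted_state + gain * (y - predicted_state)
        variance = (1 - gain) * predicted_var
        filtered.append(state)
    return filtered


# Hypothetical weight series with assumed error variances.
weights_observed = [0.30, 0.32, 0.31, 0.33]
levels = local_level_filter(
    weights_observed, obs_var=0.0004, state_var=0.0001,
    initial_state=0.30, initial_var=0.001,
)
# In this model the last filtered level is also the forecast of the next weight.
forecast_next = levels[-1]
print([round(level, 4) for level in levels])
```

The gain plays the same role as the smoothing parameter in exponential smoothing, except that it is derived from the assumed variances and changes over time.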
Missing weights for the entire time span
of component series
Use equal weights
Expert judgement
Use weights from other sources
Estimations
Product replacement
Summary
The calculation of the IIP is likely to use industry weights from period t-2
If period t-2 weights are not available, the index may be compiled using the t-3 weights
Several methods of estimating missing weights for the most recent periods are also proposed in this presentation, though there is no international recommendation in this area.
Discussion