CNI Spring 2012 Membership Meeting
Baltimore, April 1-3, 2012
Total Cost of Preservation
Cost Modeling for Sustainable Services
Stephen Abrams
Patricia Cruse
John Kunze
University of California Curation Center
California Digital Library
Outline

Goals

Prior work

Modeling preservation activity

Total cost of preservation


Source: Getty Images
Pay-as-you-go price model
Paid-up price model

Conclusions

Questions and discussion
http://wiki.ucop.edu/display/Curation/Cost+Modeling
Goals

Understand costs in order to plan for and implement
sustainable preservation services

Investigate the possibility of paid-up pricing in order
to address


Boom-or-bust budget cycles
Fixed-term, grant funded projects
Source: www.sharedidiz.com/
Prior work

Nationaal Archief (2005)
Identification of granular cost
components
http://www.nationaalarchief.nl/sites/default/files/docs/kennisbank/codpv1.pdf

LIFE (2008)
http://www.life.ac.uk/

KRDS (2010)
http://www.beagrie.com/krds.php

DataSpace (2010)
Assumption of annual decrease in
aggregate cost, i.e., discounted cash
flow (DCF)
http://arks.princeton.edu/ark:/88435/dsp01w6634361k

Jean-Daniel Zeller (2010)
“Cost of digital archiving: Is there a universal model?”
8th European Conference on Digital Archiving, Geneva, April 28-30, 2010
http://regarddejanus.files.wordpress.com/2010/05/costsdigitalarchiving-_jdz_eca2010.pdf

Rosenthal (2011)
Critique of DCF approach
http://blog.dshr.org/2011/09/modeling-economics-of-long-term-storage.html
Key assumptions

Consider only the costs incurred by the preservation
service provider


Costs of content creation by collection managers are out
of scope
Costs can be categorized unambiguously as fixed or
marginal, and one-time or recurring

One-time costs can be annualized over the effective
lifespan of the activity or system component
Cost model components

System, composed of various
 Services for necessary/desirable functions, running on
 Servers, deployed by
 Staff, in support of content
 Producers, who use
 Workflows to submit instances of
 Content Types, which occupy
 Storage, and are subject to ongoing
 Monitoring and periodic
 Interventions
Total cost of preservation
$$TCP = A + n \cdot P + m \cdot W + \ell \cdot C + k \cdot S + j \cdot M + i \cdot V$$
TCP: total cost to the service provider
A: fixed cost of the System (the System component subsumes Services and Servers)
n, P: number and unit cost of Producers
m, W: number and unit cost of Workflows
ℓ, C: number and unit cost of Content Types
k, S: number and unit cost of Storage
j, M: number and unit cost of Monitoring
i, V: number and unit cost of Interventions
Staff costs are subsumed by the other components
Total cost of preservation
$$TCP = A + n \cdot P + m \cdot W + \ell \cdot C + k \cdot S + j \cdot M + i \cdot V$$
 Model is rich enough to represent the full economic cost of
preservation
 Implemented by a spreadsheet that captures all subsidiary
costs
 But service providers can customize the model to exclude
components whose costs are not recoverable or are subsidized
as a matter of local policy
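As a minimal sketch of how such a spreadsheet calculation might look in code, the following adds the fixed System cost to the count times unit cost of each component, with an optional list of components to include so that a provider can leave out subsidized costs. The function, the component names, and every dollar figure are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    count: float      # number of units, e.g. n, m, l, k, j, i
    unit_cost: float  # annualized cost per unit, e.g. P, W, C, S, M, V


def total_cost_of_preservation(system_cost, components, include=None):
    """Return A plus the sum of count * unit_cost over the included components."""
    names = list(components) if include is None else include
    return system_cost + sum(
        components[name].count * components[name].unit_cost for name in names
    )


# Hypothetical annual figures, not taken from the presentation.
A = 50_000.0
COMPONENTS = {
    "producers": Component(20, 500.0),
    "workflows": Component(5, 2_000.0),
    "content_types": Component(10, 1_000.0),
    "storage": Component(100, 650.0),
    "monitoring": Component(4, 2_500.0),
    "interventions": Component(2, 5_000.0),
}

full_tcp = total_cost_of_preservation(A, COMPONENTS)

# A provider that subsidizes Interventions as local policy omits them.
recoverable = [name for name in COMPONENTS if name != "interventions"]
recoverable_tcp = total_cost_of_preservation(A, COMPONENTS, include=recoverable)

print(full_tcp, recoverable_tcp)
```

Keeping the full model and the recoverable subset as two calls to the same function mirrors the distinction between knowing all of the costs and deciding which of them to recover.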
Assumption: Cost allocation

Costs of the Archive, Workflows, Content Types,
Monitoring, and Interventions are “common goods”

Equally beneficial to all Producers
Properly apportioned across all Producers
Cost of a single Producer
$$G = \frac{A + m \cdot W + \ell \cdot C + j \cdot M + i \cdot V}{n} + P + k_P \cdot S$$
G: total cost attributable to a given Producer
n: number of Producers
P: unit cost of a Producer
k_P: number of Storage units attributable to the Producer
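The apportionment can be sketched in a few lines: the common goods are split evenly across the n Producers, and each Producer additionally carries its own unit cost and its own Storage. The figures below are hypothetical, and the even split of Storage (5 units for each of 20 Producers) is an assumption made only to keep the arithmetic easy to follow.

```python
def producer_cost(A, m, W, ell, C, j, M, i, V, n, P, k_p, S):
    """G = (A + m*W + ell*C + j*M + i*V) / n + P + k_p * S."""
    common_goods = A + m * W + ell * C + j * M + i * V
    return common_goods / n + P + k_p * S


# Hypothetical figures: 20 Producers holding 5 Storage units each.
shared = dict(A=50_000.0, m=5, W=2_000.0, ell=10, C=1_000.0,
              j=4, M=2_500.0, i=2, V=5_000.0, n=20, P=500.0, S=650.0)
storage_units = [5] * 20

per_producer = [producer_cost(k_p=k_p, **shared) for k_p in storage_units]

# Total cost to the provider, computed directly from the same inputs.
tcp = (shared["A"] + shared["n"] * shared["P"] + shared["m"] * shared["W"]
       + shared["ell"] * shared["C"] + sum(storage_units) * shared["S"]
       + shared["j"] * shared["M"] + shared["i"] * shared["V"])

print(per_producer[0], sum(per_producer), tcp)
```

Summing G over all Producers gives back the provider's total cost, which is the property a revenue-neutral price needs.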
Assumptions: Billing

Costs are billed for at the end of the period of
service

The cost model should be revenue neutral
Pay-as-you-go cash flow
[Cash flow diagram, t = 0 to 3: in each period the pay-as-you-go price G paid by a single Producer (income) equals the cost G of serving that Producer (expense)]
$$G(T) = \sum_{t=0}^{T-1} G = G \cdot T$$
Cumulative pay-as-you-go price over time period T
Cumulative pay-as-you-go price
as a function of time T
[Chart: cumulative pay-as-you-go price G(T), cost ($0 to $16,000) against year T (0 to 30)]
$$G(T) = \sum_{t=0}^{T-1} G = G \cdot T$$
Cumulative pay-as-you-go price over time period T
$$G(\infty) = \infty$$
… for “forever”
Assumptions: Costs over time

The aggregate cost of providing preservation service
decreases over time; and that decrease is uniform

Moore’s and Kryder’s laws
[Figures: Moore’s law, 1971 – 2011; Kryder’s law, 1980 – 2012. Source: Wikipedia]
State-of-the-art tools and understanding
Productivity increases
Discounted pay-as-you-go cash flow
[Discounted cash flow (DCF) diagram, t = 0 to 3: in successive periods the pay-as-you-go price for a single Producer (income) is G, (1–d)·G, (1–d)²·G, matching the cost of a single Producer (expense), which shrinks by the discounting factor (1–d) each period]
$$G(T, d) = \sum_{t=0}^{T-1} G \cdot (1 - d)^t$$
Discounted pay-as-you-go price over time period T
Discounted pay-as-you-go price
as a function of time T
[Chart: cost ($0 to $16,000) against year T (0 to 30), comparing cumulative pay-as-you-go G(T) with discounted pay-as-you-go G(T, d), which reflects the (1–d)^t discount factor, and its limit G(∞, d)]
$$G(T, d) = G \cdot \frac{1 - (1 - d)^T}{d}$$
Discounted pay-as-you-go price over time period T
$$G(\infty, d) = \frac{G}{d}$$
… for “forever”
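A short sketch can confirm that the closed form agrees with the year-by-year summation. The function names are illustrative; the sample values G = $650, d = 5%, and T = 10 years are the ones used in the paid-up example later in the deck.

```python
def cumulative_price(G, T):
    """Undiscounted cumulative pay-as-you-go price, G(T) = G * T."""
    return G * T


def discounted_price_sum(G, d, T):
    """Discounted price as an explicit sum of G * (1 - d)**t for t = 0 .. T-1."""
    return sum(G * (1 - d) ** t for t in range(T))


def discounted_price(G, d, T=None):
    """Closed form of the same sum; T=None stands for 'forever' and needs d > 0."""
    if T is None:
        return G / d
    return G * (1 - (1 - d) ** T) / d


G, d, T = 650.0, 0.05, 10
print(cumulative_price(G, T))
print(round(discounted_price_sum(G, d, T), 2), round(discounted_price(G, d, T), 2))
print(round(discounted_price(G, d), 2))
```

With these inputs the discounted ten-year price rounds to $5,216 against an undiscounted $6,500, and the "forever" price is finite only because d is positive.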
Discount factor

d is the weighted sum of the expected changes in
number and unit cost of individual components
$$d = \omega_A \cdot d_A + \omega_W \cdot (d_m + d_W) + \omega_C \cdot (d_\ell + d_C) + \omega_M \cdot (d_j + d_M) + \omega_V \cdot (d_i + d_V) + \omega_P \cdot d_P + \omega_S \cdot (d_k + d_S)$$

Weighting factors ω are the proportion that a
particular component contributes to the aggregate
cost G, e.g.
$$\omega_A = \frac{A}{n \cdot G} \qquad \omega_W = \frac{m \cdot W}{n \cdot G}$$
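The weighted sum might be computed as in the sketch below. Each component's weight is its dollar contribution to G divided by G, and each component's rate is the combined expected change in its number and unit cost. The weights for Producer and Storage are assumed here to follow the same pattern as the two examples given (P/G and k_P·S/G); that is an inference, and all contributions and rates are hypothetical.

```python
def aggregate_discount(contributions, rates):
    """Return (d, weights) where d is the contribution-weighted sum of the rates.

    contributions: each component's dollar share of G for one Producer
    rates: each component's combined expected annual decrease
    """
    G = sum(contributions.values())
    weights = {name: amount / G for name, amount in contributions.items()}
    d = sum(weights[name] * rates[name] for name in contributions)
    return d, weights


# Hypothetical per-Producer contributions: A/n, m*W/n, l*C/n, j*M/n, i*V/n, P, k_P*S.
contributions = {"A": 2_500.0, "W": 500.0, "C": 500.0, "M": 500.0,
                 "V": 500.0, "P": 500.0, "S": 3_250.0}
# Hypothetical combined rates, e.g. rates["S"] stands for d_k + d_S.
rates = {"A": 0.00, "W": 0.02, "C": 0.02, "M": 0.03,
         "V": 0.01, "P": 0.00, "S": 0.10}

d, weights = aggregate_discount(contributions, rates)
print(round(d, 4), round(sum(weights.values()), 6))
```

In this illustration Storage dominates both the cost and the decline, so the aggregate d sits well below the Storage rate alone.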
Drawbacks to pay-as-you-go pricing

Only viable for Producers with reliable annual
funding sources

Boom-or-bust budgeting or the termination of
funded project work can interrupt this funding

Any interruption in proactive preservation care can
lead to irretrievable data loss
Assumptions: Investment return

Preservation service providers can carry forward
budgetary surpluses across fiscal years

Surplus funds can be invested with the return
supplementing the surplus
Paid-up cash flow
[Cash flow diagram, t = 0 to 3: the paid-up price F is received as income at t = 0 and held as a surplus. In each later period the surplus earns an investment return (r·F in period 1) and pays the cost of a single Producer (G, then (1–d)·G, then (1–d)²·G). The surplus is (1+r)·F – G after period 1, (1+r)·[(1+r)·F – G] – (1–d)·G after period 2, and (1+r)·[(1+r)·[(1+r)·F – G] – (1–d)·G] – (1–d)²·G after period 3]
$$F(T, d, r) = G \cdot \sum_{t=0}^{T-1} \frac{(1 - d)^t}{(1 + r)^{t+1}}$$
Paid-up price for time period T
Paid-up price
as a function of time T
[Chart: cost ($0 to $16,000) against year T (0 to 30), comparing cumulative pay-as-you-go G(T); discounted pay-as-you-go G(T, d) and its limit G(∞, d), reflecting the (1–d)^t discount factor; and the paid-up price F(T, d, r) and its limit F(∞, d, r), reflecting the (1+r)^t investment return]
$$F(T, d, r) = G \cdot \frac{(1 + r)^T - (1 - d)^T}{(1 + r)^T \cdot (r + d)}$$
Paid-up price for time period T
$$F(\infty, d, r) = \frac{G}{r + d}$$
… for “forever”
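The closed form follows from the summation form of the paid-up price by treating it as a geometric series. A sketch of the steps, assuming r + d ≠ 0 and writing q for the common ratio (a symbol introduced here, not used in the slides):

```latex
\begin{aligned}
F(T, d, r) &= G \cdot \sum_{t=0}^{T-1} \frac{(1-d)^t}{(1+r)^{t+1}}
            = \frac{G}{1+r} \sum_{t=0}^{T-1} q^t,
            \qquad q = \frac{1-d}{1+r} \\
           &= \frac{G}{1+r} \cdot \frac{1 - q^T}{1 - q}
            = G \cdot \frac{1 - q^T}{r + d}
            = G \cdot \frac{(1+r)^T - (1-d)^T}{(1+r)^T \cdot (r + d)} \\
F(\infty, d, r) &= \frac{G}{r + d} \qquad \text{when } |q| < 1
\end{aligned}
```

The middle step uses (1 + r)(1 − q) = r + d, which is why the sum of the two rates appears in the denominator.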
Paid-up example
Year   Income        Expense      Surplus
0      $ 4,725.00    –            $ 4,725.00
1      $    94.32    $ 650.00     $ 4,285.32
2      $    83.39    $ 617.50     $ 3,764.21
3      $    72.70    $ 586.63     $ 3,262.29
4      $    62.43    $ 557.29     $ 2,778.42
5      $    52.53    $ 529.43     $ 2,311.52
6      $    42.99    $ 502.96     $ 1,859.55
7      $    33.79    $ 477.81     $ 1,422.53
8      $    24.91    $ 453.92     $   816.52
9      $    16.33    $ 431.22     $   401.63
10     $     8.03    $ 409.66     $     0.00

Pay-as-you-go price, G: $ 650 (1 TB)
Discount factor, d: 5%
Investment return, r: 2%
Term, T: 10 years
Paid-up price, F: $ 4,725

$ 4,725 (paid-up, with r and d) < $ 5,216 (pay-as-you-go, discounted by d) < $ 6,500 (pay-as-you-go, undiscounted)
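The mechanics of the example can be sketched as a simple ledger. This sketch assumes that the return is earned on the surplus carried forward from the previous year and that each year's expense is paid at year end. Other timing or rounding conventions would give different intermediate figures, so it illustrates the calculation rather than reproducing the table row by row. Function names are illustrative.

```python
def paid_up_price(G, d, r, T):
    """Closed-form paid-up price F(T, d, r)."""
    return G * ((1 + r) ** T - (1 - d) ** T) / ((1 + r) ** T * (r + d))


def ledger(F, G, d, r, T):
    """Return a list of (year, income, expense, surplus) rows."""
    surplus = F
    rows = [(0, F, 0.0, surplus)]
    for year in range(1, T + 1):
        income = r * surplus                 # return on the surplus carried forward
        expense = G * (1 - d) ** (year - 1)  # discounted cost of that year's service
        surplus = surplus + income - expense
        rows.append((year, income, expense, surplus))
    return rows


G, d, r, T = 650.0, 0.05, 0.02, 10
F = paid_up_price(G, d, r, T)
rows = ledger(F, G, d, r, T)
for year, income, expense, surplus in rows:
    print(f"{year:>4} {income:>10.2f} {expense:>10.2f} {surplus:>10.2f}")
```

Under these assumptions the one-time payment rounds to $4,725 and the surplus runs down to zero at the end of year 10, which is what makes the price revenue neutral.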
Coefficient of permanence

It is useful to be able to transition from a pay-as-you-go to a paid-up price basis

If you’re currently paying G on a pay-as-you-go
basis, you can upgrade to a paid-up basis with a
one-time payment of F = G·φ, where $\varphi = \frac{1}{r + d}$

Princeton DataSpace, φ ≈ 30 (T = ∞)

USC digital repository, φ ≈ 1.2 (T = 20)
Problems with R&D

TCP modeling is dependent on the predictive
reliability of r and d

For d, extrapolate from Moore’s and Kryder’s laws?
[Figures: Moore’s law, 1971 – 2011; Kryder’s law, 1980 – 2012. Source: Wikipedia]
For r, extrapolate from 30 year Treasury bonds?
[Figures: 30 year treasuries, 2007 – 2012 (Source: http://ycharts.com/indicators/30_year_treasury_rate); 30 year treasuries, 1882 – 2012 (Source: Robert Shiller)]
Model the risk

Round up r and d, i.e., adding a fixed “risk
premium”

Add an additional risk component R to the formula
for G
$$G = \frac{A + m \cdot W + \ell \cdot C + j \cdot M + i \cdot V}{n} + P + k_P \cdot S + R$$

Its influence on the price can grow over time, reflecting
increasing uncertainty, by setting a negative discount
factor dR so that 1–dR > 1

Note, however, that if the weighted sum d becomes less
than 0 and |d | > r then the paid-up price F (T, d, r) will not converge to a limit as T grows
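The convergence condition can be checked numerically. The sketch below assumes a risk component R that makes up 20% of G and grows 40% a year (d_R = –0.40), while the rest of G declines 5% a year. These weights and rates are hypothetical and are held fixed for simplicity.

```python
def paid_up_price_sum(G, d, r, T):
    """Paid-up price as an explicit sum over T years."""
    return sum(G * (1 - d) ** t / (1 + r) ** (t + 1) for t in range(T))


def has_perpetual_price(d, r):
    """The sum approaches G / (r + d) only if (1 - d) / (1 + r) < 1, i.e. r + d > 0."""
    return r + d > 0


# Hypothetical: 80% of G has d = 0.05, the 20% risk component has d_R = -0.40.
d_with_risk = 0.8 * 0.05 + 0.2 * (-0.40)
r = 0.02

print(has_perpetual_price(0.05, r))         # no risk component
print(has_perpetual_price(d_with_risk, r))  # weighted d is about -0.04
for T in (50, 100, 200):
    print(T, round(paid_up_price_sum(650.0, d_with_risk, r, T), 2))
```

With the weighted d at about –0.04 and r at 0.02, each added year contributes more than the last, so the price keeps growing with T instead of levelling off.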
Recalibrate the model

G and F do not have to be fixed values over time

Periodically recalculate based on current conditions
(actual costs for G ) and predictions (r and d ), and apply
prospectively

Retrospective service contracts remain “locked-in”
Hybrid price model

Distinguish between costs that are (relatively) easy
to quantify and forecast, and those that aren’t

Use the paid-up model for the former and pay-as-you-go
for the latter
Easy
Difficult
Archive
Intervention
Producer
Workflow
Content Type
Monitoring
Storage
Hybrid price model

Distinguish between costs that are (relatively) easy
to quantify and forecast, and those that aren’t

Use the paid-up model for the former and pay-as-you-go
for the latter
Easy
Difficult
Archive
Content Type
Producer
Workflow
Storage
Monitoring
Intervention

Bit preservation only
Preservation forever

Some things are intended to last forever…
Source: John Church Company
Source: United Artists
Preservation for …

A fixed term – 10 years? 20 years? – may be
appropriate for much content

Give content an opportunity to prove its worth, as
evidenced by someone’s commitment to pay for its
subsequent preservation
Embrace uncertainty

The discounted cash flow (DCF) approach is
problematic on practical and theoretical grounds


Difficulty in setting fixed values for r and d that
realistically represent financial and technological trends
over time
Stochastic modeling to determine the probability
distribution of possible outcomes

Cf. David Rosenthal, FAST ’12
http://blog.dshr.org/2012/02/fast-2012.html
CNI Fall 2011
http://www.youtube.com/watch?v=_5lQxmyz3xY
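One way to picture what stochastic modeling adds is a small Monte Carlo sketch. It is not Rosenthal's model. It assumes, purely for illustration, that each year's r and d are drawn independently from normal distributions with hypothetical means and standard deviations, and it reports the spread of the up-front fund that would have been needed along each random path.

```python
import random
import statistics


def required_fund(G, T, r_mean, r_sd, d_mean, d_sd, rng):
    """Up-front fund that exactly covers T years of cost along one random path."""
    cost = G      # cost of the first year of service
    growth = 1.0  # cumulative investment growth, the product of (1 + r)
    fund = 0.0
    for _ in range(T):
        r = rng.gauss(r_mean, r_sd)  # assumed distribution of the return
        d = rng.gauss(d_mean, d_sd)  # assumed distribution of the cost decrease
        growth *= 1 + r
        fund += cost / growth        # amount to set aside today for this year's bill
        cost *= 1 - d
    return fund


def simulate(trials, seed, **params):
    """Return the sorted fund requirements from independent random paths."""
    rng = random.Random(seed)
    return sorted(required_fund(rng=rng, **params) for _ in range(trials))


params = dict(G=650.0, T=10, r_mean=0.02, r_sd=0.01, d_mean=0.05, d_sd=0.03)
outcomes = simulate(trials=5_000, seed=2012, **params)
median = statistics.median(outcomes)
p95 = statistics.quantiles(outcomes, n=20)[-1]  # 95th percentile cut point
print(round(median, 2), round(p95, 2))
```

With both standard deviations set to zero the function reduces to the fixed-rate paid-up price. With nonzero spread, pricing at a high percentile rather than at the median is one way to express a risk premium.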
Understand the risks

Possible outcomes…

We overestimate our costs and collect too much
● Fund a higher level of service
● Refund some portion

We underestimate
● Ask for additional funds
● Lower service levels
● De-accession content – but at least it was preserved up to that
point and had a chance to prove its value, and gain an advocate
Conclusions

Different customers have different funding
capabilities


Flexibility in price models is important
Any price model is based on an idealization of the
real world

Assumptions matter

Knowing all of the costs is distinct from a policy
decision to recover all of those costs

If investment return and discount factor can be
reliably projected, then standard DCF/NPV methods
provide a reasonable prediction of long-term costs
Looking for feedback

Thanks to our reviewers

Lisa Baird, UCOP
Raym Crow, SPARC
Todd Grappone, UCLA
Cliff Lynch, CNI
David Minor, UCSD/SDSC
Richard Moore, SDSC
Michael Mundrane, UC Berkeley
Jake Nadal, UCLA
David Rosenthal, LOCKSS
Mackenzie Smith
UC Council of University Librarians /
Systemwide Operations and Planning Group
Candace Yano, UC Berkeley

 Have we missed something in our analysis, logic,
assumptions, math, etc.?
 Are the objections to the DCF-based analysis
substantial enough to invalidate this approach?
 Are there better forecasting methodologies?
 Even if we don’t have a perfect model, we need to move
forward now with a “good enough” model
For more information
Total Cost of Preservation: Cost Modeling for Sustainable Services
http://wiki.ucop.edu/display/Curation/Cost+Modeling
UC Curation Center
http://www.cdlib.org/uc3
uc3@ucop.edu
Stephen Abrams
Patricia Cruse
Scott Fisher
Erik Hetzner
Greg Janée
John Kunze
Margaret Low
David Loy
Mark Reyes
Abhishek Salve
Joan Starr
Tracy Seneca
Carly Strasser
Marisa Strong
Adrian Turner
Perry Willett