Valuing Public Domain Images on Wikipedia and Why it Matters

VALUING PUBLIC DOMAIN IMAGES ON
WIKIPEDIA AND WHY IT MATTERS
Paul J. Heald
Richard W. & Marie L. Corman
Research Professor
College of Law, University of Illinois
University of Glasgow, CREATe (RCUK
Centre for Copyright and New
Business Models in the Creative Economy)
PART I: THE PROBLEM AND WHY IT MATTERS
- Retroactive extension of the copyright term
- Nothing has fallen into the public domain in the U.S. due to expiration since 1998. The cutoff is still 1923!
- Justification? “Bad things happen when works fall into the public domain.” [allegations of non-use and over-use]
- Reality? The Problem of the Missing Works . . .
- Empirical evidence as relevant to the policy debate.

PART II: VALUATION OF PUBLIC DOMAIN WORKS
- Measuring what the creative industries lose when works don’t fall into the public domain and disappear.
- Valuing public domain images on Wikipedia as a positive example
- Possible application for thinking about valuation in litigation and transactions too . . .

[Figure: 2317 New Editions from Amazon by Decade. Series: Fiction & Non-Fiction Books; x-axis: decade, 1800 to 2000; y-axis: 0 to 400]
[Figure: Estimated Amazon Titles by Percent Per Decade. Series: Fiction & Non-Fiction Books; x-axis: decade, 1800 to 2000; y-axis: 0 to 0.3]
[Figure: Estimated Amazon Book Titles Adjusted for Total Number of Books Published Per Decade. Series: WorldCat Adjusted, CopyReg Adjusted; x-axis: decade, 1800 to 2000; y-axis: 0 to 0.25]
AVAILABILITY OF BESTSELLERS PUBLISHED 1913-22 (IN PD) AND 1923-32 (COPYRIGHTED)
[Figure: Availability of Works. Bars: Books Published 1907-22, Books Published 1923-32; y-axis: Percent available (1=100%), 0 to 1]
DO EBOOKS SOLVE THE PROBLEM?
- In 2014, 94% of 165 public domain bestsellers from 1913-22 were available in eBook format, up from 48% in 2006.
- Of 167 bestsellers from 1923-32 still under copyright, only 27% (45/167) had been made available as eBooks by publishers by 2014.
- And of those 45 copyrighted eBooks, only one was out of print in hard copy format.
- Market failure?!!
[Figure: Percentage of 800 NYT Reviewed Books in eBook Format by Decade. Series: 2014 eBooks; x-axis: 1930's to 2000's; y-axis: 0 to 0.8]
EVIDENCE OF DEMAND FOR MISSING WORKS
- Smith, Telang, and Zhang, “Analysis of the Potential Market for Out-of-Print eBooks,” http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2141422 (2012)
- Authors used matched pairs analysis to estimate a $740 million eBook market for out-of-print titles.
- Why do publishers seem to leave money on the table?
EVIDENCE OF DEMAND FOR MISSING WORKS
[Figure: Initial Publication Dates of New (Amazon) and Used Books (Abe Books) for Sale 2012-2013. Series: Used Books, New Books; x-axis: decade, 1800's to 2000's; y-axis: 0 to 10,000,000]
EXPLOITATION LEVELS OF AUDIO BOOKS
What Is Available in Audio Book Version?
- 33% public domain titles
- 16% copyrighted titles
- 80% of top 20 copyrighted titles
- 100% of top 20 public domain titles
WHAT ELSE IS AT STAKE: PENGUIN CLASSICS PRICING DATA
48 Copyrighted Books:
- Average price per book ($14.60)
- Average length (310 pages)
- Average price per page ($.047)
48 Public Domain Books:
- Average price per book ($11.10)
- Average length (374 pages)
- Average price per page ($.03)
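The per-page figures above can be checked with a short sketch. It assumes the price per page is simply the average price divided by the average length; the slide does not say whether the figure was instead averaged book by book. The dictionary layout and the name `price_per_page` are illustrative only.

```python
# Averages as printed on the slide for the two groups of 48 Penguin Classics.
penguin = {
    "copyrighted": {"avg_price": 14.60, "avg_pages": 310},
    "public_domain": {"avg_price": 11.10, "avg_pages": 374},
}


def price_per_page(avg_price, avg_pages):
    """Ratio of average price to average length (an assumed definition)."""
    return avg_price / avg_pages


per_page = {group: price_per_page(**data) for group, data in penguin.items()}

# Relative per-page premium for the copyrighted group over the public domain group.
premium = per_page["copyrighted"] / per_page["public_domain"] - 1

for group, value in per_page.items():
    print(f"{group}: ${value:.3f} per page")
print(f"per-page premium for copyrighted titles: {premium:.0%}")
```

Under that assumption the ratios round to the $.047 and $.03 shown on the slide.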
AUDIO BOOK PRICING DATA
Average price per minute of playing time, based on the lowest price version at audible.com:
- For the top 20 PD titles: (CD), (MP3)
- For the top 20 copyrighted titles: (CD), (MP3)
DOES LACK OF OWNERSHIP CAUSE OVERUSE?
- Copyright owners license compositions once every 3.3 years.
- Public domain compositions are used once every 3.8 years.
- No evidence of over-grazing when looking at the songs as a group
OVER-USE?, CONT’D
- The two most exploited PD songs are:
  - Danny Boy (9 movies from 1993-2001)
  - After You’ve Gone (9 movies, 1996-2006)
- Copyrighted songs in the 1930’s:
  - Sweet Georgia Brown (15 movies)
  - Am I Blue? (17 movies)
  - Happy Days Are Here Again (34 movies)
- Copyrighted songs more recently:
  - Blue Skies (10 movies from 1994-2004)
  - Stardust (10 movies in the 1990’s)
  - Dream a Little Dream of Me (10 movies from 1995-2005)
WIKIPEDIA RESEARCH . . . CALCULATING THE VALUE OF THE PUBLIC DOMAIN
CONTEXT
- UK Intellectual Property Office wants to know! Debate over retroactive extension of the copyright term.
- Evaluating the benefits of orphan works legislation
- Exercise in valuation with applicability to damage calculation in cases of image infringement on the web, e.g., how much is an infringer unjustly enriched by appropriating an image?
- Prior research on the cost of © protection
- How to evaluate the positive benefit of the lack of © protection?
DATA SOURCE: WHY WIKIPEDIA?
- Everyone agrees that Wikipedia is a valuable resource
- Public domain photos add value to pages
- Data about use of photos is transparent and accessible
- See http://en.wikipedia.org/wiki/Amy_Tan
- See http://stats.grok.se/
VALUING WHAT?
- How to calculate the value of a copyright in a photo to its owner?
- How to calculate the value of a copyright in a photo to the public?
- How to calculate value of the absence of legal protection to the public?
- Private value ≠ public welfare
POLLOCK HYPO
A copyrighted book sells for $10 in the book shop. It falls into the public domain and now sells in the shop for $5 and is available for free on the internet.
- Has the value of the book changed?
- Less valuable to the former copyright owner
- More valuable to the public (cheaper)
- Should policymakers encourage the change in legal status?
- As long as the book remains accessible, we see an increase in consumer surplus of $5-$10 per copy.
- So why ever protect a work with copyright?

RESEARCH QUESTIONS
- Is a sample of Wikipedia web pages more likely to contain an image when a public domain work is available?
- To what extent does the availability of public domain images lower the cost of web page building?
- To what extent does the addition of an image to a web page increase traffic to that page?
- Can the total value of both cost savings and increased traffic due to the use of public domain images on Wikipedia be quantified by reference to the characteristics of the sample of Wikipedia pages?

PHASE I: BESTSELLING AUTHORS
Identify 365 authors with New York Times year-end bestselling novels in the United States from 1895 to 1965 and collect data for each author:
- Number of bestsellers, date of first bestseller, birth and death date of author;
- Wikipedia URL of author page and date image of author (if any) added;
- Copyright status of any author image and legal justification for any image in the public domain;
- Number of Amazon reviews of most popular book for each author;
- Number of page views in March, April, and May of 2009 and 2014;
- Word count on author page as of June 2009 and June 2014.
OLDER AUTHORS = MORE IMAGES
Public domain effect means that older authors (counter-intuitively) have more images:

Bestselling Authors by Date of Birth: Percent with Image on Wiki Page
Born     <1850  <1860  <1870  <1880  <1890  <1900  <1910  <1920  <1940
Percent  0.93   0.92   0.82   0.81   0.61   0.58   0.52   0.46   0.54
n        15     25     46     52     68     53     49     35     28

362 Bestselling Authors by Date of Death: Percent with Image on Wiki Page
Died     <1910  <1920  <1930  <1940  <1950  <1960  <1970  <1980  <1990  <2000  <2014
Percent  0.9    0.94   0.8    0.8    0.76   0.69   0.56   0.63   0.6    0.53   0.31
n        10     16     30     39     49     45     52     28     34     26     33
SOURCE OF IMAGES?
[Figure: Legal Status of Author Images (percent). Categories: Copyrighted, Public Domain; plotted values: 0.79, 0.21]
[Figure: Justification for Image Use (percent). Plotted values: 0.54, 0.13, 0.07, 0.12, 0.13]
PRELIMINARY CONCLUSION
- The public domain clearly increases the number of photos on Wiki web pages.
- This adds value, but how much?
- Direct value might be measured in costs saved to page builders.
- Indirect value might be measured in terms of increased traffic to web sites with images.
http://www.koozai.com/blog/searchmarketing/content-marketingseo/increase-traffic-with-images/
COSTS SAVED: KIPLING ET AL . . .
- Free on Wikimedia Commons
- License for 1 year: $105 on Corbis and $117 on Getty Images
COSTS SAVED . . .
- 25 authors have public domain images exactly the same as those licensed by Corbis or Getty
- 104 more have public domain images similar to those licensed by Corbis or Getty
- Average yearly license = $120
- Page builders saved approximately $77,400 over a five-year period (129 public domain images x $120/year x 5 years).
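The savings figure above is a straight product, which the following minimal sketch reproduces. The function name `license_savings` is illustrative and not taken from the study.

```python
def license_savings(n_images, yearly_license_usd, years):
    """Licensing fees avoided by using free images instead of licensed ones."""
    return n_images * yearly_license_usd * years


exact_matches = 25     # public domain images identical to Corbis/Getty images
similar_matches = 104  # public domain images similar to Corbis/Getty images

total = license_savings(exact_matches + similar_matches, 120, 5)
print(f"${total:,} saved over five years")
```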
INCREASED TRAFFIC?
- Authors with images had a total of 6.8 million views during March, April, and May of 2014.
- Authors without images had a total of 386,000 views during March, April, and May of 2014.
- Suggests serious need to adjust for author popularity, but . . .
- Adjusting for a page’s word count seems unnecessary. (From June 2009 to June 2014, word count for authors with images went up 68%, while over the same period word count for authors without images went up 67%.)
BIG PROBLEM . . .
- How would you adjust for differences in traffic caused by the popularity of the author?
- Ernest Hemingway is very popular
- Maarten Maartens, not so much . . .
ADJUSTING FOR POPULARITY #1
- As a measure of popularity, the number of Amazon reviews for each author’s most reviewed book was counted.
- Authors were grouped according to the Amazon review number: 0-9, 10-29, 30-99, 100-200.
- Authors with more than 200 customer reviews were omitted: 47 with images; 5 without.
Median Page Views: March, April, & May 2014
(N = authors with image / authors without image)
Amazon Reviews   N       Authors with Image   Authors without Image
0-9              76/57   1326                 758
10-29            36/21   2590                 1595
30-99            43/21   5224                 2436
100-199          32/14   11575                5168
ADJUSTING FOR POPULARITY #2
- 40 pairs of authors without images on June 1, 2009 were matched together based on a similar or exact number of page views counted during the months of March, April, and May 2009.
- This created a set of pairs of authors of similar popularity at a time when none of them had images on their web pages.
- Half of the authors received an image before March 1, 2014, and one-half did not.
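The slides do not say how the pairs were formed, so the following is only one plausible sketch: a greedy nearest-neighbor match on baseline page views. The function `match_pairs`, the `max_gap` tolerance, and the entries named "Author A", "Author B", and "Author C" are hypothetical; the Gwen Davis and James Will view counts are the ones reported in this deck.

```python
def match_pairs(treated, control, max_gap=0.05):
    """Greedily pair each treated page with the closest unused control page.

    treated: pages that later acquired an image, name -> baseline views
    control: pages that never acquired an image, name -> baseline views
    max_gap: hypothetical tolerance, as a fraction of the treated page's views
    """
    available = dict(control)
    pairs = []
    # Work upward from the least-viewed treated page.
    for name, views in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        best = min(available, key=lambda c: abs(available[c] - views))
        if abs(available[best] - views) <= max_gap * views:
            pairs.append((name, best))
            del available[best]  # each control page is used at most once
    return pairs


# Baseline views for March, April, and May 2009.
later_got_image = {"Gwen Davis": 544, "Author A": 1200}
never_got_image = {"James Will": 542, "Author B": 1190, "Author C": 5000}

pairs = match_pairs(later_got_image, never_got_image)
print(pairs)
```

A greedy pass is simple but order dependent; an optimal assignment method could produce tighter pairs on a larger sample.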
MATCHED PAIRS METHODOLOGY
- In March, April, & May of 2009, the Gwen Davis page [no image] had 544 views.
- In March, April, & May of 2009, the James Will page [no image] had 542 views.
- In March, April, & May of 2014, the Gwen Davis page [image added 2011] had 675 page views.
- In March, April, & May of 2014, the James Will page [no image] had 525 page views.
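For this single pair, the comparison can be worked through as a percentage change for each page and the gap between the two. This is a minimal sketch; the helper `pct_change` is illustrative, and one pair says nothing about the sample-wide result.

```python
def pct_change(before, after):
    """Fractional change in page views between two periods."""
    return (after - before) / before


# March-May page views in 2009 and 2014, as reported above.
gwen_davis = pct_change(544, 675)  # page gained an image in 2011
james_will = pct_change(542, 525)  # page never gained an image

net = gwen_davis - james_will  # gap in percentage points, as a fraction

print(f"Gwen Davis: {gwen_davis:+.1%}")
print(f"James Will: {james_will:+.1%}")
print(f"Gap for this pair: {net:.1%}")
```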
[Figure: 6% Traffic Increase from June 2009 to June 2014. Bars: Authors with Images, Authors without Images; y-axis: Traffic Increase, 0 to 0.35]
ADJUSTING FOR POPULARITY #3
- Identified the lowest traffic month for each author in the year prior to June 2009 and June 2014.
- 42 tightly matched pairs of authors with and without images based on lowest traffic month in the year prior to 2009.
- Authors with images showed a 36% increase in traffic from 2009-2014, while authors without images showed a 19% increase.
- Net increase associated with image use = 17%

COMPOSERS AND LYRICISTS: ADJUSTING FOR POPULARITY #4
- Matched 77 pairs and compared the number of page views during March, April, and May of 2009, before any composer or lyricist page acquired an image, with the number of page views in March, April, and May of 2014, after half of the pages acquired an image.
- Tightly matched: pages that never acquired an image had 209,116 aggregate page views in March, April, and May of 2009, while pages that later acquired an image had 209,294.
- Between 2009 and 2014, the traffic to pages with images increased 56% while the traffic to pages without images increased only 34%, resulting in a net increase in traffic to pages with images of 22%.
COMPOSERS AND LYRICISTS FOR MORE DATA POINTS: ADJUSTING FOR POPULARITY #5
- 68 tightly matched pairs based on the lowest traffic month for each composer and lyricist in 2009 before any sample page contained an image.
- Over the five-year period, traffic to pages with images increased 40% while the traffic to pages without images increased only 21%, resulting in a net increase of 19%.
INCREASED TRAFFIC DUE TO IMAGES ON WIKIPEDIA PAGES?
- Amazon Review Adjustment = 100%
- Matched Pairs #1 (authors) = 6%
- Matched Pairs #2 (authors) = 17%
- Matched Pairs #3 (composers) = 22%
- Matched Pairs #4 (composers) = 19%
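The net figures for the matched-pair studies appear to be simple differences between the growth rates of the two groups. The sketch below recomputes them in whole percentage points from the growth rates reported on the earlier slides; the study labels are informal descriptions, not the deck's own names.

```python
# (growth with images, growth without images), 2009 to 2014, in percent,
# as reported on the matched-pair slides.
studies = {
    "authors, lowest-traffic month": (36, 19),
    "composers and lyricists, March-May views": (56, 34),
    "composers and lyricists, lowest-traffic month": (40, 21),
}

# Net increase associated with images, in percentage points.
net = {name: with_img - without_img
       for name, (with_img, without_img) in studies.items()}

for name, points in net.items():
    print(f"{name}: {points} points")
```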
EXTRAPOLATING FROM RANDOM PAGES
- 300 random pages studied
- 50% contain images
- 87% of images are in the public domain
- The pages can be categorized: 25% (Places), 27% (Biographical), 5% (Events), and 43% (Things)
EXTRAPOLATING COSTS SAVED . . .
- 4,560,201 [total Wikipedia pages as of July 18, 2014] x .50 x .87 = approximately 2,000,000 pages with public domain images
- Given that Corbis and Getty routinely charge $105 and $117, respectively, to license a photographic image for a year on the internet, this suggests a net savings of $208 million to $232 million per year.
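The range above can be reproduced from the figures on this slide. This sketch assumes one licensed image per page and uses the unrounded page estimate rather than the rounded 2,000,000; the variable names are illustrative.

```python
total_pages = 4_560_201      # total Wikipedia pages, as printed on this slide
share_with_image = 0.50      # share of sampled pages containing an image
share_public_domain = 0.87   # share of those images in the public domain

# Estimated number of pages carrying a public domain image.
pd_pages = total_pages * share_with_image * share_public_domain

low = pd_pages * 105   # yearly license at the Corbis rate
high = pd_pages * 117  # yearly license at the Getty rate

print(f"pages with public domain images: {pd_pages:,.0f}")
print(f"yearly savings: ${low / 1e6:.0f} million to ${high / 1e6:.0f} million")
```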
EXTRAPOLATING INCREASED TRAFFIC
- 4,560,021 [total Wiki pages as of 6/14] x
- .5 [percentage of pages with images] x
- .87 [percentage of pages with public domain images] x
- 18,966 [average page views per year] x
- .0053 [average value of a Wikipedia page view] x
- .19 [percent of traffic due to public domain image] =
- $37,884,478.77 per year traffic value
ROBUSTNESS CHECK: WILLINGNESS TO PAY?
- 240 authors with images received approximately 28 million page views in 2014. Hypothetical cost of licenses = approximately $28,800 (240 x $120/year). Per page view cost = 1/10 of a penny.
- If the 19% traffic increase figure is correct, then images drove 5,320,000 of our authors’ page views in 2014. If the WebInDetail estimate of a $.0053 value for each Wikipedia page view is also correct, then the advertising value of the images on our author web pages is $28,196.
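The two sides of this comparison can be recomputed side by side. The sketch treats the 28 million page views as an exact figure, although the slide gives it only as an approximation; the variable names are illustrative.

```python
authors_with_images = 240
yearly_license_usd = 120
page_views_2014 = 28_000_000   # approximate, per the slide

# Cost side: what licensing the images would have cost for one year.
license_cost = authors_with_images * yearly_license_usd
cost_per_view = license_cost / page_views_2014

# Value side: views attributed to images, priced at the per-view estimate.
image_driven_views = 0.19 * page_views_2014
ad_value = image_driven_views * 0.0053

print(f"license cost: ${license_cost:,} (${cost_per_view:.4f} per view)")
print(f"image-driven views: {image_driven_views:,.0f}")
print(f"advertising value: ${ad_value:,.0f}")
```

The license cost and the advertising value land within a few hundred dollars of each other, which is the point of the robustness check.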

OH, AND BUY MY BOOK!
SOURCES
Buccafusco, Christopher & Paul Heald. 2013. “Do Bad Things Happen When Works Fall into the Public Domain?: Empirical Tests of Copyright Term Extension,” 28 Berkeley Journal of Law & Technology 1-43.
Brooks, Tim. 2005. Survey of Reissues of U.S. Recordings. Washington, D.C.: Library of Congress, available at http://www.clir.org/pubs/reports/pub133.
Crook, John R. 2013. “U.S. Supports New Treaty to Facilitate Visually Impaired Persons’ Access to Book,” 107 American Journal of Int’l Law 933-34.
David, Paul & Jared Rubin. 2008. “Restricting Access to Books on the Internet: Some Unanticipated Effects of U.S. Copyright Legislation,” 5 Review of Economic Research on Copyright Issues 23-53.
Erickson, Christopher, Paul J. Heald, and Martin Kretschmer. 2015. “The Valuation of Unprotected Works: A Case Study of Public Domain Images on Wikipedia,” 28 Harvard Journal of Law & Technology ___.
Favale, Marcella, et al. 2013. Copyright and the Regulation of Foreign Works: A Comparative Review of Seven Jurisdictions and a Rights Clearance Simulation. London: Intellectual Property Office.
Ginsburg, Jane. 2000. “From Having Copies to Experiencing Works: The Development of an Access Right in U.S. Copyright Law,” in Hugh Hansen (ed.), U.S. Intellectual Property: Law & Policy. Sweet & Maxwell: London.
Heald, Paul J. 2008a. “Property Rights and the Efficient Exploitation of Copyrighted Works: An Empirical Analysis of Public Domain and Copyrighted Fiction Bestsellers,” 93 Minnesota Law Review 1031-63.
Heald, Paul J. 2008b. “Optimal Remedies for Patent Infringement: A Transactions Cost Approach,” 45 Houston Law Review 1165-1200.
Heald, Paul J. 2014a. “How Secondary Liability Rules Create a Market for Music on YouTube,” 82 University of Missouri-Kansas City Law Review 313-26.
Heald, Paul J. 2014b. “How Copyright Keeps Works Disappeared,” 11 Journal of Empirical Legal Studies 829-66.
Landes, William & Richard Posner. 2003. The Economic Structure of Intellectual Property Law. Boston: Belknap Press.
Liebowitz, Stan & Stephen Margolis. 2005. “17 Famous Economists Weigh in on Copyright: The Role of Theory, Empirics, and Network Effects,” 18 Harvard Journal of Law and Technology 435-57.
Liebowitz, Stan J. 2009. “The Myth of Copyright Inefficiency,” 32 Journal of Regulation 28-34.
Loren, Lydia. 2007. “Building a Reliable Semi-Commons of Creative Works: Enforcement of Creative Commons License and the Limited Abandonment of Copyright,” 14 George Mason Law Review 271-328.
Lunney, Glynn. 1996. “Reexamining Copyright’s Incentives-Access Paradigm,” 49 Vanderbilt Law Review 483-656.
Mueller-Langer, Frank & Richard Watt. 2010. “Copyright and Open Access for Copyrighted Works,” 7 Review of Economic Research on Copyright Issues 45-65.
Schonwetter, Tobias, et al. 2009-2010. “Copyright and Education: Lessons from African Copyright and Access to Knowledge,” African Journal of Information and Communication 37-52.
Smith, Michael, Rahul Telang, and Yi Zhang. 2012. “Analysis of the Potential Market for Out-of-Print eBooks,” available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2141422.
Suzor, Nicholas. 2013. “Access, Progress, and Fairness: Rethinking Exclusivity in Copyright,” 15 Vanderbilt Journal of Entertainment & Technology Law 297-342.