
Post on 14-Dec-2015


VALUING PUBLIC DOMAIN IMAGES ON WIKIPEDIA AND WHY IT MATTERS

Paul J. Heald
Richard W. & Marie L. Corman Research Professor
College of Law, University of Illinois

University of Glasgow, CREATe (RCUK Centre for Copyright and New Business Models in the Creative Economy)

PART I: THE PROBLEM AND WHY IT MATTERS

Retroactive extension of the copyright term

Nothing has fallen into the public domain in the U.S. due to expiration since 1998. The public domain cutoff is still 1923!

Justification? “Bad things happen when works fall into the public domain.” [allegations of non-use and over-use]

Reality? The Problem of the Missing Works . . .

Empirical evidence as relevant to the policy debate.

PART II: VALUATION OF PUBLIC DOMAIN WORKS

Measuring what the creative industries lose when works don’t fall into the public domain and disappear.

Valuing public domain images on Wikipedia as a positive example

Possible application for thinking about valuation in litigation and transactions too . . .

[Figure: 2317 New Editions from Amazon by Decade (Fiction & Non-Fiction Books); horizontal axis 1800 to 2000, vertical axis 0 to 400]

[Figure: Estimated Amazon Titles by Percent Per Decade (Fiction & Non-Fiction Books); horizontal axis 1800 to 2000, vertical axis 0 to 0.3]

[Figure: Estimated Amazon Book Titles Adjusted for Total Number of Books Published Per Decade; series: WorldCat Adjusted, CopyReg Adjusted; vertical axis 0 to 0.25]

[Figure: Availability of Works; vertical axis: Percent available (1 = 100%); series: Books Published 1907-22, Books Published 1923-32]

AVAILABILITY OF BESTSELLERS PUBLISHED 1913-22 (IN PD) AND 1923-32 (COPYRIGHTED)

DO EBOOKS SOLVE THE PROBLEM?

In 2014, 94% of 165 public domain bestsellers from 1913-22 were available in eBook format, up from 48% in 2006.

Of 167 bestsellers from 1923-32 still under copyright, only 27% (45/167) had been made available as eBooks by publishers by 2014.

And of those 45 copyrighted eBooks, only one was out-of-print in hard copy format.

Market failure?!!

[Figure: Percentage of 800 NYT Reviewed Books in eBook Format by Decade; horizontal axis 1930's to 2000's, vertical axis 0 to 0.8; series: 2014 eBooks]

EVIDENCE OF DEMAND FOR MISSING WORKS

Smith, Telang, and Zhang, “Analysis of the Potential Market for Out-of-Print eBooks,” http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2141422 (2012)

Authors used matched pairs analysis to estimate a $740 million eBook market for out-of-print titles.

Why do publishers seem to leave money on the table?

EVIDENCE OF DEMAND FOR MISSING WORKS

[Figure: Initial Publication Dates of New (Amazon) and Used Books (Abe Books) for Sale 2012-2013; series: Used Books, New Books; vertical axis 0 to 10,000,000]

EXPLOITATION LEVELS OF AUDIO BOOKS

What Is Available in Audio Book Version?

33% of public domain titles

16% of copyrighted titles

80% of top 20 copyrighted titles

100% of top 20 public domain titles

WHAT ELSE IS AT STAKE: PENGUIN CLASSICS PRICING DATA

48 Copyrighted Books:
--Average price per book ($14.60)
--Average length (310 pages)
--Average price per page ($.047)

48 Public Domain Books:
--Average price per book ($11.10)
--Average length (374 pages)
--Average price per page ($.03)
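The per-page figures can be checked with a few lines of Python. This is only a sketch: it divides the average price by the average length as printed on the slide, which is not necessarily how the underlying study computed its per-page averages, and the variable names are illustrative.

```python
# Figures copied from the slide; the dictionary layout is illustrative.
penguin_samples = {
    "copyrighted": {"avg_price": 14.60, "avg_pages": 310},
    "public_domain": {"avg_price": 11.10, "avg_pages": 374},
}


def price_per_page(avg_price, avg_pages):
    """Average price divided by average length, in dollars per page."""
    return avg_price / avg_pages


per_page = {
    status: price_per_page(**figures)
    for status, figures in penguin_samples.items()
}

# Relative per-page premium of the copyrighted sample over the public domain sample.
premium = per_page["copyrighted"] / per_page["public_domain"] - 1

for status, value in per_page.items():
    print(f"{status}: ${value:.3f} per page")
print(f"copyrighted premium per page: {premium:.0%}")
```

On these inputs the copyrighted sample costs a little under five cents per page against roughly three cents for the public domain sample.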

AUDIO BOOK PRICING DATA

Average price per minute of playing time, based on the lowest-priced version at audible.com:

Top 20 PD titles: $0.038 (CD), $0.028 (MP3)

Top 20 copyrighted titles: $0.05 (CD), $0.036 (MP3)

DOES LACK OF OWNERSHIP CAUSE OVERUSE?

Copyright owners license compositions once every 3.3 years.

Public domain compositions are used once every 3.8 years.

No evidence of over-grazing when looking at the songs as a group.

OVER-USE?, CONT’D

The two most exploited PD songs are:
--Danny Boy (9 movies from 1993-2001)
--After You’ve Gone (9 movies, 1996-2006)

Copyrighted songs in the 1930’s:
--Sweet Georgia Brown (15 movies)
--Am I Blue? (17 movies)
--Happy Days Are Here Again (34 movies)

Copyrighted songs more recently:
--Blue Skies (10 movies from 1994-2004)
--Stardust (10 movies in the 1990’s)
--Dream a Little Dream of Me (10 movies from 1995-2005)


WIKIPEDIA RESEARCH . . . CALCULATING THE VALUE OF THE PUBLIC DOMAIN

CONTEXT

UK Intellectual Property Office wants to know!

Debate over retroactive extension of the copyright term.

Evaluating the benefits of orphan works legislation

Exercise in valuation with applicability to damage calculation in cases of image infringement on the web, e.g., how much is an infringer unjustly enriched by appropriating an image?

Prior research on the cost of © Protection:

How to evaluate the positive benefit of the lack of © protection?

DATA SOURCE: WHY WIKIPEDIA?

Everyone agrees that Wikipedia is a valuable resource

Public domain photos add value to pages

Data about use of photos is transparent and accessible

See http://en.wikipedia.org/wiki/Amy_Tan

See http://stats.grok.se/

VALUING WHAT?

How to calculate the value of a copyright in a photo to its owner?

How to calculate the value of a copyright in a photo to the public?

How to calculate value of the absence of legal protection to the public?

Private value ≠ public welfare

POLLOCK HYPO

A copyrighted book sells for $10 in the book shop. It falls into the public domain and now sells in the shop for $5 and is available for free on the internet.

Has the value of the book changed?
--Less valuable to the former copyright owner
--More valuable to the public (cheaper)

Should policymakers encourage the change in legal status?

As long as the book remains accessible, we see an increase in consumer surplus of $5-$10 per copy.

So why ever protect a work with copyright?

RESEARCH QUESTIONS

Is a sample of Wikipedia web pages more likely to contain an image when a public domain work is available?

To what extent does the availability of public domain images lower the cost of web page building?

To what extent does the addition of an image to a web page increase traffic to that page?

Can the total value of both cost savings and increased traffic due to the use of public domain images on Wikipedia be quantified by reference to the characteristics of the sample of Wikipedia pages?

PHASE I: BESTSELLING AUTHORS

Identify 365 authors with New York Times year-end bestselling novels in the United States from 1895 to 1965 and collect data for each author:

Number of bestsellers, date of first bestseller, birth and death date of author;

Wikipedia URL of author page and date image of author (if any) added;

Copyright status of any author image and legal justification for any image in the public domain;

Number of Amazon reviews of most popular book for each author;

Number of page views in March, April, and May of 2009 and 2014.

Word count on author page as of June 2009 and June 2014

OLDER AUTHORS = MORE IMAGES

Public domain effect means that older authors (counter-intuitively) have more images:

[Figure: Bestselling Authors by Date of Birth, Percent with Image on Wiki Page]

Born before    n     Share with image
1850           15    0.93
1860           25    0.92
1870           46    0.82
1880           52    0.81
1890           68    0.61
1900           53    0.58
1910           49    0.52
1920           35    0.46
1940           28    0.54

[Figure: 362 Bestselling Authors by Date of Death, Percent with Image on Wiki Page]

Died before    n     Share with image
1910           10    0.90
1920           16    0.94
1930           30    0.80
1940           39    0.80
1950           49    0.76
1960           45    0.56
1970           52    0.60
1980           28    0.69
1990           34    0.53
2000           26    0.31
2014           33    0.63

SOURCE OF IMAGES?

[Figure: Legal Status of Author Images (Percent)]

Copyrighted      0.21
Public Domain    0.79

[Figure: Justification for Image Use (Percent)]

Copyright-Fair Use      0.13
Copyright-Permission    0.07
PD-Dedicated            0.12
PD-Expiry               0.54
PD-Other                0.13

PRELIMINARY CONCLUSION

The Public Domain clearly increases the number of photos on Wiki web pages.

This adds value, but how much?

Direct value might be measured in costs saved to page builders.

Indirect value might be measured in terms of increased traffic to web sites with images. http://www.koozai.com/blog/search-marketing/content-marketing-seo/increase-traffic-with-images/

COSTS SAVED: KIPLING ET AL . . .

Free on Wikimedia Commons

License for 1 Year: $105 on Corbis and $117 on Getty Images

COSTS SAVED . . .

25 authors have public domain images exactly the same as those licensed by Corbis or Getty

104 more have public domain images similar to those licensed by Corbis or Getty

Average yearly license = $120

Page builders saved approximately $77,400 over a five-year period (129 public domain images x $120/year x 5 years).
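The savings figure is a straightforward product, sketched below with the slide's numbers. The variable names are illustrative, and treating "similar" images as fully substitutable for the licensed ones is the slide's assumption, not something the code verifies.

```python
# Counts and prices as stated on the slide.
exact_matches = 25            # PD images identical to Corbis/Getty images
similar_matches = 104         # PD images similar to Corbis/Getty images
avg_yearly_license_usd = 120  # average yearly license fee
years = 5

pd_images = exact_matches + similar_matches
savings_usd = pd_images * avg_yearly_license_usd * years

print(f"{pd_images} images x ${avg_yearly_license_usd}/year x {years} years = ${savings_usd:,}")
```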

INCREASED TRAFFIC?

Authors with images had a total of 6.8 million views during March, April, and May of 2014

Authors without images had a total of 386,000 views during March, April, and May of 2014

Suggests serious need to adjust for author popularity, but . . .

Adjusting for a page’s word count seems unnecessary. (From June 2009 to June 2014, word count for authors with images went up 68%, while over the same period word count for authors without images went up 67%.)

BIG PROBLEM . . .

How would you adjust for differences in traffic caused by the popularity of the author?

Ernest Hemingway is very popular; Maarten Maartens, not so much . . .

ADJUSTING FOR POPULARITY #1

As a measure of popularity, the number of Amazon reviews for each author’s most reviewed book was counted.

Authors were grouped according to the Amazon review number: 0-9, 10-29, 30-99, 100-200.

Authors with more than 200 customer reviews were omitted: 47 with images; 5 without.

[Figure: Median Page Views: March, April, & May 2014]

Amazon reviews    N        Authors with Image    Authors without Image
0-9               76/57    1326                  758
10-29             36/21    2590                  1595
30-99             43/21    5224                  2436
100-199           32/14    11575                 5168
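A rough way to read the chart is to compare the two medians within each review bucket. The sketch below assumes the without-image medians are 758, 1595, 2436, and 5168 (our reading of the chart labels) and that the values pair with the buckets in the order shown; the names are illustrative.

```python
# Median page views for March, April, and May 2014, read from the chart labels.
# Assumption: the without-image medians are 758, 1595, 2436, and 5168.
buckets = ["0-9", "10-29", "30-99", "100-199"]
median_with_image = [1326, 2590, 5224, 11575]
median_without_image = [758, 1595, 2436, 5168]

# Relative uplift: how much higher the with-image median is in each bucket.
uplift = {
    bucket: with_img / without_img - 1
    for bucket, with_img, without_img in zip(
        buckets, median_with_image, median_without_image
    )
}

for bucket, value in uplift.items():
    print(f"{bucket} reviews: {value:.0%} more median views with an image")
```

Under that reading, pages with images draw well over half again as many median views in every bucket, and more than double in the two most-reviewed buckets.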

ADJUSTING FOR POPULARITY #2

40 pairs of authors without images on June 1, 2009 were matched based on similar or identical numbers of page views during the months of March, April, and May 2009.

This created a set of pairs of authors of similar popularity at a time when none of them had images on their web pages.

Half of the authors received an image before March 1, 2014, and one-half did not.

MATCHED PAIRS METHODOLOGY

In March, April, & May of 2009, the Gwen Davis page [no image] had 544 views.

In March, April, & May of 2009, the James Will page [no image] had 542 views.

In March, April, & May of 2014, the Gwen Davis page [image added 2011] had 675 views.

In March, April, & May of 2014, the James Will page [no image] had 525 views.
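For this one pair, the comparison amounts to a difference in growth rates. The sketch below uses only the four view counts given above; the function name is hypothetical, and a single pair is an illustration of the method rather than an estimate of the effect.

```python
def growth(before, after):
    """Relative change in page views between two periods."""
    return after / before - 1


# Page views in March-May 2009 and March-May 2014, from the slide.
gwen_davis = growth(544, 675)   # image added in 2011
james_will = growth(542, 525)   # never received an image

difference = gwen_davis - james_will

print(f"Gwen Davis: {gwen_davis:+.1%}")
print(f"James Will: {james_will:+.1%}")
print(f"Difference within the pair: {difference:+.1%}")
```

The study's figure comes from applying the same comparison across all 40 pairs, not from any single pair.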

[Figure: Percent Traffic Increase from June 2009 to June 2014; series: Authors with Images, Authors without Images; chart label: 6%]

ADJUSTING FOR POPULARITY #3

Identified the lowest traffic month for each author in the year prior to June 2009 and June 2014.

42 tightly matched pairs of authors with and without images based on lowest traffic month in the year prior to 2009.

Authors with images showed a 36% increase in traffic from 2009-2014, while authors without images showed a 19% increase.

Net increase associated with image use = 17%

COMPOSERS AND LYRICISTS : ADJUSTING FOR POPULARITY #4

77 pairs were matched, comparing the number of page views during March, April, and May of 2009, before any composer or lyricist page acquired an image, with the number of page views in March, April, and May of 2014, after half of the pages acquired an image.

Tightly matched: pages that never acquired an image had 209,116 aggregate page views in March, April, and May of 2009, while pages that later acquired an image had 209,294.

Between 2009 and 2014, the traffic to pages with images increased 56% while the traffic to pages without images increased only 34%, resulting in a net increase in traffic to pages with images of 22%.

COMPOSERS AND LYRICISTS FOR MORE DATA POINTS: ADJUSTING FOR POPULARITY #5

68 tightly matched pairs based on the lowest traffic month for each composer and lyricist in 2009 before any sample page contained an image.

Over the five-year period, traffic to pages with images increased 40% while the traffic to pages without images increased only 21%, resulting in a net increase of 19%.
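The net figures reported for the matched-pair designs are simple differences between the two groups' growth rates. The sketch below recomputes them from the percentages stated on the slides; the labels are illustrative, and the first authors-only matched-pair design is left out because the slides give only its net result.

```python
# Percent traffic growth, 2009 to 2014, as reported on the slides.
# Tuple order: (pages that gained an image, pages that did not).
growth_pct = {
    "authors, lowest-traffic month": (36, 19),
    "composers and lyricists, Mar-May views": (56, 34),
    "composers and lyricists, lowest-traffic month": (40, 21),
}

# Net increase associated with image use, in percentage points.
net_increase_pct = {
    design: with_image - without_image
    for design, (with_image, without_image) in growth_pct.items()
}

for design, net in net_increase_pct.items():
    print(f"{design}: net increase of {net} percentage points")
```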

INCREASED TRAFFIC DUE TO IMAGES ON WIKIPEDIA PAGES?

Amazon Review Adjustment = 100%

Matched Pairs #1 (authors) = 6%

Matched Pairs #2 (authors) = 17%

Matched Pairs #3 (composers) = 22%

Matched Pairs #4 (composers) = 19%

EXTRAPOLATING FROM RANDOM PAGES

300 random pages studied

50% contain images

87% of images are in the public domain

The pages can be categorized: 25% (Places), 27% (Biographical), 5% (Events), and 43% (Things)

EXTRAPOLATING COSTS SAVED . . .

4,560,201 [total Wikipedia pages as of July 18, 2014] x .50 x .87 = approximately 2,000,000 pages with public domain images

Given that Corbis and Getty routinely charge $105 and $117 respectively to license a photographic image for a year on the internet, this suggests a net savings of $208 million to $232 million per year.
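The range can be reproduced by carrying the unrounded page count through the multiplication. This is a sketch using the slide's inputs; the variable names are illustrative, and it assumes one licensed image per page.

```python
# Inputs as stated on the slide.
total_pages = 4_560_201      # Wikipedia pages as of July 18, 2014
share_with_images = 0.50     # from the 300-page random sample
share_public_domain = 0.87   # share of those images in the public domain
corbis_fee_usd = 105         # yearly license fee
getty_fee_usd = 117          # yearly license fee

pd_image_pages = total_pages * share_with_images * share_public_domain

low_savings = pd_image_pages * corbis_fee_usd
high_savings = pd_image_pages * getty_fee_usd

print(f"Pages with public domain images: {pd_image_pages:,.0f}")
print(f"Yearly savings: ${low_savings / 1e6:.0f} million to ${high_savings / 1e6:.0f} million")
```

The product is about 1.98 million pages, which is why the totals come out at $208 million and $232 million rather than the $210 million and $234 million that a flat 2,000,000 would give.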

EXTRAPOLATING INCREASED TRAFFIC

4,560,021 [total Wiki pages as of 6/14]
x .5 [percentage of pages with images]
x .87 [percentage of pages with public domain images]
x 18,966 [average page views per year]
x .0053 [average value of a Wikipedia page view]
x .19 [percent of traffic due to public domain image]
= $37,884,478.77 per year traffic value
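The traffic-value estimate is a chain of six factors, which the following sketch multiplies out using the slide's inputs. The variable names are illustrative, and each factor carries the assumption stated in its bracketed label on the slide.

```python
# Factors as stated on the slide.
total_pages = 4_560_021          # total Wiki pages as of 6/14
share_with_images = 0.5          # pages with images
share_public_domain = 0.87       # of those, public domain images
avg_views_per_year = 18_966      # average page views per year
value_per_view_usd = 0.0053      # average value of a Wikipedia page view
traffic_share_from_image = 0.19  # share of traffic attributed to the image

traffic_value_usd = (
    total_pages
    * share_with_images
    * share_public_domain
    * avg_views_per_year
    * value_per_view_usd
    * traffic_share_from_image
)

print(f"Estimated yearly traffic value: ${traffic_value_usd:,.2f}")
```

Because the result is a pure product, it scales one-for-one with every input: substituting the 6% or 22% traffic figure for 19%, for example, moves the total proportionally.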

ROBUSTNESS CHECK: WILLINGNESS TO PAY?

240 authors with images received approximately 28 million page views in 2014. Hypothetical cost of licenses = approximately $28,000 (240 x $120/year). Per page view cost = 1/10 of a penny.

If the 19% traffic increase figure is correct, then images drove 5,320,000 of our authors’ page views in 2014. If the WebInDetail estimate of a $.0053 value for each Wikipedia page view is also correct, then the advertising value of the images on our author web pages is $28,196.
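The two sides of this robustness check can be set next to each other in a short sketch. The inputs are the slide's figures and the names are illustrative; the exact license total works out to $28,800, which the slide gives as approximately $28,000.

```python
# Inputs as stated on the slide.
authors_with_images = 240
avg_yearly_license_usd = 120
page_views_2014 = 28_000_000     # approximate views for those authors
traffic_share_from_image = 0.19
value_per_view_usd = 0.0053      # WebInDetail estimate cited on the slide

# Cost side: what licensing the images would have cost for one year.
license_cost_usd = authors_with_images * avg_yearly_license_usd
cost_per_view_usd = license_cost_usd / page_views_2014

# Benefit side: advertising value of the views attributed to the images.
image_driven_views = page_views_2014 * traffic_share_from_image
ad_value_usd = image_driven_views * value_per_view_usd

print(f"Hypothetical license cost: ${license_cost_usd:,} (${cost_per_view_usd:.4f} per view)")
print(f"Image-driven views: {image_driven_views:,.0f}")
print(f"Advertising value of those views: ${ad_value_usd:,.0f}")
```

The two totals land within a few percent of each other, which is the point of the check: the estimated traffic value is close to what a page builder would have had to pay for the images.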

OH, AND BUY MY BOOK!

SOURCES

Buccafusco, Christopher & Paul Heald. 2013. “Do Bad Things Happen When Works Fall into the Public Domain?: Empirical Tests of Copyright Term Extension,” 28 Berkeley Journal of Law & Technology 1-43.

Brooks, Tim. 2005. Survey of Reissues of U.S. Recordings. Washington, D.C.: Library of Congress, available at http://www.clir.org/pubs/reports/pub133.

Crook, John R. 2013. “U.S. Supports New Treaty to Facilitate Visually Impaired Persons’ Access to Book,” 107 American Journal of Int’l Law 933-34.

David, Paul & Jared Rubin. 2008. “Restricting Access to Books on the Internet: Some Unanticipated Effects of U.S. Copyright Legislation,” 5 Review of Economic Research on Copyright Issues 23-53.

Erickson, Christopher, Paul J. Heald, and Martin Kretschmer. 2015. “The Valuation of Unprotected Works: A Case Study of Public Domain Images on Wikipedia,” 28 Harvard Journal of Law & Technology ___.

Favale, Marcella, et al. 2013. Copyright and the Regulation of Foreign Works: A Comparative Review of Seven Jurisdictions and a Rights Clearance Simulation. London: Intellectual Property Office.

Ginsburg, Jane. 2000. “From Having Copies to Experiencing Works: The Development of an Access Right in U.S. Copyright Law,” in Hugh Hansen (ed.), U.S. Intellectual Property: Law & Policy. Sweet & Maxwell: London.

Heald, Paul J. 2008a. “Property Rights and the Efficient Exploitation of Copyrighted Works: An Empirical Analysis of Public Domain and Copyrighted Fiction Bestsellers,” 93 Minnesota Law Review 1031-63.

Heald, Paul J. 2008b. “Optimal Remedies for Patent Infringement: A Transactions Cost Approach,” 45 Houston Law Review 1165-1200.

Heald, Paul J. 2014a. “How Secondary Liability Rules Create a Market for Music on YouTube,” 82 University of Missouri-Kansas City Law Review 313-26.

Heald, Paul J. 2014b. “How Copyright Keeps Works Disappeared,” 11 Journal of Empirical Legal Studies 829-66.

Landes, William & Richard Posner. 2003. The Economic Structure of Intellectual Property Law. Boston: Belknap Press.

Liebowitz, Stan & Stephen Margolis. 2005. “17 Famous Economists Weigh in on Copyright: The Role of Theory, Empirics, and Network Effects,” 18 Harvard Journal of Law and Technology 435-57.

Liebowitz, Stan J. 2009. “The Myth of Copyright Inefficiency,” 32 Journal of Regulation 28-34.

Loren, Lydia. 2007. “Building a Reliable Semi-Commons of Creative Works: Enforcement of Creative Commons License and the Limited Abandonment of Copyright,” 14 George Mason Law Review 271-328.

Lunney, Glynn. 1996. “Reexamining Copyright’s Incentives-Access Paradigm,” 49 Vanderbilt Law Review 483-656.

Mueller-Langer, Frank & Richard Watt. 2010. “Copyright and Open Access for Copyrighted Works,” 7 Review of Economic Research on Copyright Issues 45-65.

Schonwetter, Tobias, et al. 2009-2010. “Copyright and Education: Lessons from African Copyright and Access to Knowledge,” African Journal of Information and Communication 37-52.

Smith, Michael, Rahul Telang, and Yi Zhang. 2012. “Analysis of the Potential Market for Out-of-Print eBooks,” available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2141422.

Suzor, Nicholas. 2013. “Access, Progress, and Fairness: Rethinking Exclusivity in Copyright,” 15 Vanderbilt Journal of Entertainment & Technology Law 297-342.
