RLG Programs Copyright and Large-scale Digitization: Implications for Access Merrilee Proffitt Constance Malpas RLG Programs CNI Fall Task Force Washington, DC 10 December 2007

Dec 17, 2014


Nancy Elkington

CNI Fall Task Force Presentation: Copyright and Large-scale Digitization: Implications for Access, by Merrilee Proffitt and Constance Malpas with RLG Programs
Transcript
Page 1: Cni Dec 2007 Copyright And Mass Dig For Cni

RLG Programs

Copyright and Large-scale Digitization: Implications for Access

Merrilee Proffitt, Constance Malpas

RLG Programs

CNI Fall Task Force, Washington, DC, 10 December 2007

Page 2: Cni Dec 2007 Copyright And Mass Dig For Cni

RLG Programs Copyright and Large-scale Digitization

CNI Fall Task Force Meeting - 10 December 2007


This presentation . . .

Summarizes findings from conversations with RLG Programs Partners regarding copyright assessment practice

and considers the implications of these practices in light of:

What we know about the system-wide book collection (‘supply’)

What we can observe about need and use of that collection (‘demand’)

Speculations about how increased discoverability of digitized text may impact use (and management) of library print collections

Page 3: Cni Dec 2007 Copyright And Mass Dig For Cni


Interviews with RLG Programs Partners

8 interviewees; some (not all) engaged in mass digitization

All identify “high-risk materials” in order to eliminate them from the pool, and focus on making as much low-risk content available as possible

Books, published in the US, before 1923

Not a lot of effort devoted to this work at this time

Some well-established numbers from University of Michigan on costs for “low-hanging fruit” and for identifying low-risk materials to 1963

Left aside are riskier materials to 1963; materials published outside of the US; materials after 1963

Page 4: Cni Dec 2007 Copyright And Mass Dig For Cni


1923-1963: How much? What’s the impact on research and teaching?

Based on a January 2007 snapshot of WorldCat, we can estimate that ~15% of US imprints were published between 1923 and 1963; ~2M titles

Independent studies at Stanford and Michigan suggest that ~30% of US imprints are in copyright; up to 70% may be in the public domain

An optimistic scenario: ~2M * .70 = ~1.4M titles

Add to this the pre-1923 books already in the public domain, est. ~15% of US imprints; optimistically, a total of ~3.4M titles, or the volume equivalent of a mid-level ARL collection

Suppose we go as far as we can with this? What’s the likely impact?
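
The optimistic scenario on this slide can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function and variable names are hypothetical, and the pre-1923 count of ~2M is an assumption inferred from the slide (the same ~15% share as 1923-1963, and the stated ~3.4M total), not a figure quoted directly.

```python
def optimistic_public_domain_titles(titles_1923_1963, public_domain_pct, titles_pre_1923):
    """Return (cleared 1923-1963 titles, optimistic total) using integer arithmetic."""
    # Share of 1923-1963 titles that may turn out to be in the public domain
    cleared = titles_1923_1963 * public_domain_pct // 100
    # Add the pre-1923 titles, which are already in the public domain
    return cleared, cleared + titles_pre_1923

# ~2M titles for 1923-1963 and up to 70% public domain are quoted on the slide;
# ~2M pre-1923 titles is an assumption (same ~15% share of US imprints).
cleared, total = optimistic_public_domain_titles(2_000_000, 70, 2_000_000)
print(f"1923-1963 titles possibly in the public domain: {cleared:,}")
print(f"Optimistic total with pre-1923 titles: {total:,}")
```

Integer percentages are used so the result is exact rather than subject to floating-point rounding.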

Page 5: Cni Dec 2007 Copyright And Mass Dig For Cni


Supply: the system-wide book collection

Based on historical samples of monographic titles in the WorldCat database:

15-20% published (anywhere) before 1923; ~10-14M titles

15% published (anywhere) 1923-1963; ~10M titles

US imprints only (i.e., the titles for which North American libraries might reasonably expect to undertake copyright assessment efforts), based on a random sample of 1000 monographic titles:

15% published before 1923: public domain

15% published 1923-1963: moderate risk/effort

30% published 1964-1988: high risk/effort

27% published after 1989: greatest risk/effort

7% ambiguous pub’n data: unknown risk/effort

Page 6: Cni Dec 2007 Copyright And Mass Dig For Cni


Increasing risk = increased reward?

[Figure: Distribution of Content by US Copyright Regime, based on a random sample of US imprints]

Books published between 1923 and 1963 are only part of the picture

Page 7: Cni Dec 2007 Copyright And Mass Dig For Cni


US imprints in 1000 record sample

[Figure: bar chart of titles in sample (axis 0 to 70) by decade of publication, 1700s to 2000-2007, annotated by period of publication]

Period of Publication:

200 years of production: 15% of sample

4 decades: 15% of sample

13 yrs: 17%

10 yrs: 19%

18 yrs: 27%

Page 8: Cni Dec 2007 Copyright And Mass Dig For Cni


US imprints in 1000 record sample

[Figure: bar chart of titles in sample (axis 0 to 70) by decade of publication, 1700s to 2000-2007]

Optimistically, ~26% of US imprints could be made accessible with some research

~74% of US books will require more work, other players
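
One plausible reading of where the ~26% and ~74% figures come from, using numbers given elsewhere in this deck: all pre-1923 US imprints (15% of the sample) plus up to 70% of the 1923-1963 imprints (another 15% of the sample). This derivation is an assumption, not something the slide states, and the names below are hypothetical.

```python
def accessible_share(pre_1923_pct, mid_period_pct, public_domain_pct):
    """Return (accessible, remaining) shares of US imprints, both in percent."""
    # Portion of the 1923-1963 share that may be in the public domain
    cleared_mid = mid_period_pct * public_domain_pct / 100
    accessible = pre_1923_pct + cleared_mid
    return accessible, 100 - accessible

# 15% pre-1923, 15% 1923-1963, up to 70% of the latter in the public domain
accessible, remaining = accessible_share(15, 15, 70)
print(f"Accessible with some research: {accessible:.1f}%")
print(f"Requiring more work, other players: {remaining:.1f}%")
```

Under these assumptions the split is 25.5% to 74.5%, which is consistent with the slide's rounded ~26% and ~74%.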

Page 9: Cni Dec 2007 Copyright And Mass Dig For Cni


What’s missing from this picture?

[Figure: bar chart of titles in sample (axis 0 to 70) by decade of publication, 1700s to 2000-2007, with periods of publication marked]

Page 10: Cni Dec 2007 Copyright And Mass Dig For Cni


What’s missing from this picture?

[Figure: Holdings for US imprints in 1000 record sample; titles and holdings (axis 0 to 70) by decade of publication, 1700s to 2000-2007, with periods of publication marked]

Page 11: Cni Dec 2007 Copyright And Mass Dig For Cni


What’s missing from this picture?

[Figure: Holdings for US imprints in 1000 record sample; titles and holdings (axis 0 to 70) by decade of publication, 1700s to 2000-2007, with periods of publication marked]

While the ratio of holdings to titles increases over time, aggregate supply dips in the period when copyright restrictions are most onerous

Median holdings per manifestation = 2

Max. holdings for a single manifestation = 737

Page 12: Cni Dec 2007 Copyright And Mass Dig For Cni


What’s missing from this picture?

Books published outside of the United States

[Figure: pie chart with segments labeled “Books published elsewhere”, “US imprints”, and “?”; values shown are 27%, 69%, and 4%]

Based on January 2007 snapshot of published print books in WorldCat; n = 48M titles

Page 13: Cni Dec 2007 Copyright And Mass Dig For Cni


Other Dimensions of Supply

What about holdings/availability? In our sample of US imprints:

~90% of titles with >50 holdings were published after 1963

All titles with >300 holdings were published after 1963

Work-level holdings may help fill the gap for titles with sparse holdings at manifestation level; mostly for teaching/learning

What about non-US book titles? Based on a January 2007 snapshot of WorldCat:

US imprints account for ~30% of the global book collection; non-US publications account for ~70% of print book records in WorldCat

Holdings for non-US publications are relatively scarce (viz. OCLC/ARL Global Resources report, 2007)

Place of publication not always explicit; additional research needed before copyright assessment can even begin

What about non-book materials? Monographs are just one part of the scholarly record

Page 14: Cni Dec 2007 Copyright And Mass Dig For Cni


Page 15: Cni Dec 2007 Copyright And Mass Dig For Cni


Page 16: Cni Dec 2007 Copyright And Mass Dig For Cni


Page 17: Cni Dec 2007 Copyright And Mass Dig For Cni


Demand: What access is needed to support scholarship?

Citations to US imprints (monographs only), by period of publication

Title                      -1922     1923-1963   1964-1977   1978-1988   1989-
Lawrence and Aaronsohn     8 (8%)    28 (29%)    12 (12%)    9 (9%)      40 (41%)
Shakespeare the Thinker    1 (2%)    16 (38%)    12 (29%)    8 (19%)     5 (12%)
The First Word             0         2 (2%)      5 (6%)      5 (6%)      70 (85%)
All three titles           4%        21%         13%         9%          52%

Lawrence and Aaronsohn: US imprints account for only 1/3 of works cited

Shakespeare the Thinker: US imprints account for less than ¼ of works cited

The First Word: Almost all monographs cited published in the US. 2/3 of sources were from journal literature (not counted)
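
The percentages on this slide can be recomputed from the citation counts. The sketch below assumes each percentage is a title's count for a period divided by that title's total US-imprint monograph citations, rounded to a whole number; the names `PERIODS`, `CITATIONS`, and `shares` are hypothetical.

```python
PERIODS = ["-1922", "1923-1963", "1964-1977", "1978-1988", "1989-"]

# Citation counts as given on the slide, one list per title, in PERIODS order
CITATIONS = {
    "Lawrence and Aaronsohn": [8, 28, 12, 9, 40],
    "Shakespeare the Thinker": [1, 16, 12, 8, 5],
    "The First Word": [0, 2, 5, 5, 70],
}

def shares(counts):
    """Convert counts to whole-number percentages of their own total."""
    total = sum(counts)
    return [round(100 * c / total) for c in counts]

per_title = {title: shares(counts) for title, counts in CITATIONS.items()}

# Pool the three titles by summing counts period by period
pooled_counts = [sum(column) for column in zip(*CITATIONS.values())]
pooled = shares(pooled_counts)

for title, pcts in per_title.items():
    print(title, dict(zip(PERIODS, pcts)))
print("All three titles", dict(zip(PERIODS, pooled)))
```

The per-title results agree with the percentages printed on the slide. The pooled figures suggest that the otherwise unlabeled row (21%, 4%, 13%, 9%, 52%) describes the three titles combined, although the pooled share for 1978-1988 comes out at 10% under this rounding rather than the 9% shown.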

Page 18: Cni Dec 2007 Copyright And Mass Dig For Cni


Consequences of greater discoverability of monographs: Scenario A

Use of print decreases:

Learners, teachers, and researchers turn to what’s available and useable in digital form rather than print materials; use of print collections declines

Scope of scholarly record is defined opportunistically, based on what’s most conveniently available

For some fortunate scholars, greater discoverability is accompanied by greater rights to use digitized text, but availability is determined by institutional affiliation

Inequitable access to ‘liquid text’ produces an uneven body of scholarly analysis; incentives to create new analytic tools are limited

Page 19: Cni Dec 2007 Copyright And Mass Dig For Cni


Consequences of greater discoverability of monographs: Scenario B

Use and value of print collections increase:

Learners, teachers, and researchers find more materials online; because they can't get these in digital form, use of print increases.

Existing print copies and delivery apparatus can meet the demand. (But what about shifting models for print?)

Existing copies and delivery apparatus can't meet the need, and that creates an opportunity for someone to do something despite rights restrictions to make print or electronic forms of high-demand materials more available. Must be high-value enough to bring rights holders to the table.

Existing copies and delivery apparatus can't meet the need but there isn't enough incentive for anyone to solve this problem.