Data and the law Dorothea Salo
Page 1: Data and the Law

Data and the law

Dorothea Salo

Page 2: Data and the Law

Copyright and the digital humanities

http://www.flickr.com/photos/84299143@N00/2490004138/

Page 3: Data and the Law

Why do you care?

•Do you want to USE something that may be under legal restrictions?
  • Publishing a photo in a book or ebook
  • Text-mining (“non-consumptive use”)
  • Extensive quotation
  • Classroom use

•Do you want to MAKE something?
  • ... and let other people legally use it?
  • ... and let other people legally use it, but only under certain conditions?
  • ... without them bugging you by email all the time?

•Then you NEED the basics of copyright.

Page 4: Data and the Law

DH and copyright

•DHers study and use a lot of copyrighted objects, in ways that sometimes create risk of infringement (or perceived infringement).

•Copyright creates barriers to accessing materials that DHers would like to study.
  • Librarians can sometimes help break down these barriers.

•DHers therefore need a base-level understanding of copyright, and a willingness to research beyond the base level.
  • And sometimes a willingness to take risks!

Page 5: Data and the Law

Sound off!

•News or projects that have copyright implications for DHers?

Page 6: Data and the Law

Stuff you can use with relative ease

•Public domain stuff
  • ... if you can figure out what that is, cf. “Happy Birthday” lawsuit

•Federal-government stuff

•Openly-licensed stuff
  • Creative Commons is your friend!

•Other licensed stuff
  • ... but you better follow the terms of the license!

•Other stuff, to an extent: “fair use”

Page 7: Data and the Law

How does copyright work?

http://www.flickr.com/photos/84299143@N00/2490004138/

Page 8: Data and the Law

What is it (in the US)?

•A limited monopoly granted by federal law
  • over “original works of authorship” that are “fixed in a tangible medium of expression”*

•“To promote the progress of science and useful arts”
  • Is it still doing that? You decide. But I think not, on the whole.

•Not unlimited! Not forever! By design!

*yes, the Internet counts as “tangible” for copyright purposes

Page 9: Data and the Law

A copyrightable item must minimally...

•Be original
  • Feist v. Rural Telephone Service: “sweat of the brow” does not suffice to make something copyrightable
  • This is one reason you’ll hear “data can’t be copyrighted.”

•Be fixed in some “tangible” form
  • yes, the Internet counts!!!!

•That’s it. But it wasn’t always.
  • Registration used to be required, not optional.
  • If you didn’t renew? You snooze, you lose.
  • Didn’t put an explicit copyright notice on it? Oops.

Page 10: Data and the Law

Copyright does not cover...

•Ideas (only their fixed expression)
  • Databases and other fact collections! (notably different overseas)

•Methods, processes, systems (patent!)
  • Recipes are uncopyrightable. Bet you didn’t know that.
  • Messy exception: software.

•Words. (trademark!) Titles. Recipes.
  • Invented languages? Nobody’s sure.
  • Natural languages? Nope.

•Works by the federal government

•Works already in the public domain
  • no takebacks! ... except Golan v. Holder.

Page 11: Data and the Law

Copyright DOES cover...

•Unpublished material
  • more straitly than published! and with different time rules!

•Images and photographs
  • Are copyrighted! Just like text!
  • It’s not “fair use” just because you found it on the Internet.
  • It’s not “fair use” because you give credit; US copyright law says nothing about credit!

•Sound (and fury)...
  • Same idea.
  • Except pre-1972 sound recordings do not fall under federal copyright at present! Patchwork of state law; talk of “harmonization.”
  • (Please don’t ask me about sampling. ARGH.)

Page 12: Data and the Law

Copyright lasts...

•For something created 1978 or later:
  • Life of author plus 70 years
  • For corporate-created works (often “works for hire”), 120 years after creation or 95 years after publication, whichever expires first.
  • Copyright Act of 1976

•For something created between 1923 and 1977:
  • ... that’s a really good question, because of all the former copyright formalities that don’t exist now.

•Pre-1923: probably public domain

•Once copyright expires, the item is in the “public domain.”
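The term rules for works created in 1978 or later are simple enough to sketch in code. What follows is only a rough illustration, not legal advice: the function name and its parameters are made up for this example, it assumes the corporate term is whichever of the two periods runs out first, and it assumes terms run through the end of the calendar year. It deliberately refuses pre-1978 works, because those depend on the old formalities.

```python
from typing import Optional


def last_year_of_us_copyright(
    creation_year: int,
    author_death_year: Optional[int] = None,
    publication_year: Optional[int] = None,
    corporate: bool = False,
) -> Optional[int]:
    """Estimate the last calendar year of US copyright (hypothetical helper).

    Only models works created in 1978 or later. Returns None when there
    is not enough information to decide (e.g. the author is still alive).
    """
    if creation_year < 1978:
        # Registration, renewal, and notice rules make this a case-by-case question.
        raise ValueError("pre-1978 works depend on formalities; not modeled here")

    if corporate:
        # 120 years after creation or 95 years after publication,
        # assumed here to be whichever expires first.
        candidates = [creation_year + 120]
        if publication_year is not None:
            candidates.append(publication_year + 95)
        return min(candidates)

    if author_death_year is None:
        return None  # author living, or death year unknown

    # Life of the author plus 70 years.
    return author_death_year + 70


# An author who died in 2000: protected through 2070.
print(last_year_of_us_copyright(1980, author_death_year=2000))
# A corporate work created in 1990 and published in 1995: protected through 2090.
print(last_year_of_us_copyright(1990, publication_year=1995, corporate=True))
```

Even this toy version needs a death year or a publication year before it can answer, which is exactly the information that tends to be missing for older or obscure works.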

Page 13: Data and the Law

What’s copyrightable?

Page 14: Data and the Law

Copyfraud

•Claiming a copyright that either doesn’t exist, or is someone else’s.
  • Bridgeman v. Corel: “slavish copying” of a physical item, as in a photographic or digitized reproduction, fails copyright’s originality test.

•Not illegal, sadly.

•ENDEMIC. Don’t believe every copyright notice you read!
  • GLAM are not immune to copyfraud.
  • Remember: we have the rights we USE and DEFEND. You may have to intervene with your publishers!

•Recommended: Jason Mazzone

Page 15: Data and the Law

Doctrine of first sale

•Owning copyright in a work does not confer control over legally-made physical copies of that work.

•Buyers who buy legally can share, lend, and resell their copies freely.
  • They can’t make copies of their legally-obtained copies without incurring copyright-litigation risk, however.
  • Kirtsaeng v. Wiley: copies purchased legally overseas ARE subject to first-sale, CAN be imported and resold legally.

•There is no right of first sale in digital materials. Only physical ones.

Page 16: Data and the Law

Important note

•Everything I’ve told you is for US works.

•Copyright works differently elsewhere! (Yes, despite Berne.)
  • “Moral rights” of authors
  • Copyright term length
  • What is copyrightable

•This is a wretched headache.
  • If you have an international-copyright question, SEE A LAWYER. Really. I mean it!

Page 17: Data and the Law

What good is copyright?

http://www.flickr.com/photos/84299143@N00/2490004138/

Page 18: Data and the Law

What can you do* with your copyrighted work?

Copy

Perform

All rights sold separately!

Republish

Translate

Adapt

“derivative work”

Broadcast

Arrange

Use as part of a new work

Allow or restrict access

Write a sequel

* and prevent others from doing without permission

Page 19: Data and the Law

What can you do with your copyright?

•Sell it, in whole or in part.

•Sign it away without payment.
  • For the most part, this is what faculty do with their journal articles.

•License it (i.e. give others permission to use some or all of your rights)
  • for broad or narrow purposes
  • temporarily or permanently
  • “exclusive”ly or non-
  • free or for pay.
  • It’s just like any other license. You negotiate it! (With a lawyer around.)

Page 20: Data and the Law

A “copyright transfer agreement” is what it sounds like!

Once you transfer your exclusive copyright over a work to someone else, YOU NO LONGER OWN THE WORK.

You have no say whatever in what is done with or to it, AND YOU CANNOT USE IT AS THOUGH YOU OWNED ITS COPYRIGHT.

Publishers ask you to sign these. KNOW WHAT YOU ARE SIGNING.

Page 21: Data and the Law

Libraries and licensing

•All those nifty ebooks and ejournals the library gets you access to?

•The library pays ridiculous boatloads of money to LICENSE (not own!) them.
  • No first-sale! These are digital!
  • And their use is subject to whatever terms the publisher/aggregator and library signed.

•You can’t treat ‘em like print, sorry!

Page 22: Data and the Law

Working with licensed materials

•E.g. text mining, visualizations, etc.
  • here’s that “non-consumptive use” “distant reading” thing again...

•Please don’t Just Do It!
  • Licensors monitor use. If you download a whole bunch at once, they’ll notice, and they’ll yank access for all of campus.
  • The worst-case consequences for you could be severe. We learned this, sadly, from what happened to Aaron Swartz.

•Talk to your librarians.
  • A given aggregator may have a research program you can join.
  • Or the library may be able to work out a deal.
  • Or the licensor, when contacted through the library’s channels, may say “Oh. Huh. Sure, why not?”

Page 23: Data and the Law

“Orphan work”

•Copyright can leave its original owner (via sale or other transfer), in part or in whole.
  • Authors die. So do publishers. Wills? Don’t make me laugh.

•Copyright registration has been optional for many years.
  • It’s not optional if you actually want to sue! But you can still register after an infringement has taken place.

•Result: large body of copyrighted works whose owners are unknown or unclear.
  • Especially from the mid-to-late 20th century.

•What about digitization? DH work? Preservation?

Page 24: Data and the Law

Digital copyright

http://www.flickr.com/photos/84299143@N00/2490004138/

Page 25: Data and the Law

Copyright and the digital realm

•Suddenly it’s a lot easier to make perfect copies!

•Some of the workings of the Internet require copies!
  • Your web browser makes a copy of every page you see
  • Exception: “streaming media”

•Current media business model is founded upon the difficulty of making perfect copies.

•Solution (?): DRM!

Page 26: Data and the Law

Digital rights management

•Technological jiggery-pokery that locks a digital file into certain uses
  • By device
  • By time or number-of-use limits
  • By software
  • By user or geography
  • Examples: various ebook schemes, DVD “zoning”

•Eschenfelder: “technological protection measures”
  • DRM (“hard” TPM) plus heightened annoyance factors (“soft” TPM)

Page 27: Data and the Law

DRM and the library

•DRMed files present a substantial digital preservation risk

•E-journals and databases could use DRM on their materials...
  • ... but mostly haven’t, preferring proxy servers and “annoyance factor” tricks (obfuscation, omission, polyglot, frustration)
  • And preservation practices for these are fairly well-established.

•Ebooks, however, are another story.

Page 28: Data and the Law

DRM and the law: DMCA

•Digital Millennium Copyright Act (1998)

•Illegal to circumvent DRM
  • For us too! No exceptions for GLAM. Or fair use. Or research.
  • No, not even for preservation.

•ISPs must take down allegedly copyright-infringing content when notified

•Notable chilling effects
  • Sklyarov case (2001), Felten case (cryptography), Sony rootkit case
  • YouTube and other web properties are still struggling with how to manage DMCA at scale. This has bitten some DHers!

Page 29: Data and the Law

CFAA

•Computer Fraud and Abuse Act

•Meant to go after black-hat hackers

•Loose enough wording for prosecutors to attack any terms-of-service violation

•Used against Aaron Swartz, others

•“Aaron’s Law” just introduced in Congress

Page 30: Data and the Law

Advocacy

•Librarians are political animals, especially around intellectual-property and privacy law.
  • We have to be!

•Faculty: please make common cause with us. We need more voices!
  • And humanists tend to be... less helpful than we’d like.

•In the hopper: US copyright “reform,” international treaties, ebook access for the blind, open access to federally-funded research
  • Twitterers/Tumblarians: watch @ARLPolicy

Page 31: Data and the Law

Exceptions and workarounds to copyright

http://www.flickr.com/photos/84299143@N00/2490004138/

Page 32: Data and the Law

Copyright permits...

•Copying for certain socially-approved uses
  • Library preservation and patron service (“section 108”)
  • Classroom use (“the TEACH Act”)

•Limited copying for other reasons: “fair use” (“section 107”)
  • Scholarship
  • Parody/satire
  • Etc.

Page 33: Data and the Law

Fair use

•Possibly the least-understood concept in copyright!

•An “affirmative defense” in a copyright lawsuit.
  • Though Kevin Smith notably disagrees with this analysis...

•Principles and guidelines, not hard-and-fast rules.

Page 34: Data and the Law

How to know for sure whether a use is fair, in four simple steps

1. Copy a copyrighted work.
2. Get yourself sued by the work’s legitimate copyright owner.
3. Assert fair use as your defense.
4. Win the case.

AFAIK, this is the only way.

Page 35: Data and the Law

I’m thinking you think this is a loony way to proceed.

Good. I agree with you.

But that means that what we’re doing is risk management.

Page 36: Data and the Law

Risk is never zero.

I wish it could be too.

I’m sorry.

(but if it makes you feel any better, many copyright risks are overblown)

Page 37: Data and the Law

Four-factor fair use test

•Character of the use
  • “Transformative use” finding favor with judges lately.

•Nature of the work

•Amount of the work copied
  • often considered as a percentage of the whole
  • also, “heart of the work” matters

•Effect on the market for that work, if everybody did what you’re doing
  • part of this is asking whether there IS a market for the work in the first place!

Page 38: Data and the Law

Community fair-use principles

•Started with documentarists
  • who couldn’t get insurance for their work because of perceived copyright-infringement risk... which, given litigious idiots who sue over background noise... was a rational stance.
  • So they published a “how documentarists use fair use” document.

•Courts took notice. So more such documents have been created.
  • Academic libraries (ARL), journalism (Center for Social Media)

•There isn’t one for DH. There should be. Talk to your professional organizations!

Page 39: Data and the Law

Creative Commons

•What if you WANT people to reuse your stuff?
  • You could grant it to the public domain...
  • ... but then anybody can do anything with it.

•Creative Commons is a middle ground.
  • Boilerplate language and machine-readable techniques for licensing copyrighted works to all comers!
  • Under certain conditions...

•N.b.: CC is predicated on owning a copyright. If you don’t, you can’t use a CC license! That rules it out:
  • If there’s a copyright, but it’s not yours. (Jointly-held with others is okay.)
  • If it’s not copyrightable to begin with

Page 40: Data and the Law

CC license provisions

•BY: Must attribute to creator.
  • On all CC licenses except CC0 (public domain dedication)

•ND: No derivative works.

•NC: Non-commercial use only.
  • Looks better than it is. Avoid!

•SA: Share-alike
  • Release new work under the same or more liberal license.

•These can be combined!

•CC0: total rights waiver.
  • Special resonance for data!
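To make “these can be combined” concrete, here is a small sketch that builds a license name from the provisions above. The function name is invented for illustration, and it covers only the standard attribution-based licenses (CC0 is a separate waiver, not a combination). One constraint it encodes: ND and SA cannot go together, since share-alike only has meaning when derivatives are allowed.

```python
def cc_license_name(nc: bool = False, nd: bool = False, sa: bool = False) -> str:
    """Build a Creative Commons license name (hypothetical helper).

    BY is always included, because every CC license other than CC0
    requires attribution.
    """
    if nd and sa:
        # Share-alike governs derivative works; ND forbids them entirely.
        raise ValueError("ND and SA cannot be combined")

    parts = ["BY"]
    if nc:
        parts.append("NC")
    if nd:
        parts.append("ND")
    if sa:
        parts.append("SA")
    return "CC " + "-".join(parts)


# Enumerate every valid combination of the three optional provisions.
all_licenses = sorted(
    cc_license_name(nc, nd, sa)
    for nc in (False, True)
    for nd in (False, True)
    for sa in (False, True)
    if not (nd and sa)
)
print(all_licenses)
```

The loop yields six licenses, from CC BY (the most permissive) to CC BY-NC-ND (the most restrictive).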

Page 41: Data and the Law

CC and the humanities

•So, that thing with the UK history editors...

•University-press editors are often not friends of openness either.

•I really hope 2013 is the year we start calling these people on their, um, errors and misrepresentations.

•DH is in a good position to do that.
  • It’s more open and public than much of the humanities.
  • And slightly (only slightly!) less dependent on traditional book publishing.

Page 42: Data and the Law

Okay, so?

•The point of keeping data is to reuse it!
  • Okay, there are other points, such as reproducibility and fraud detection. Still. The central reason we’re talking about data so intently is reuse value.

•Data with legal strings attached are harder to reuse. So fewer people reuse them.
  • Kinda defeats the purpose, no?

•This is why, as a digital humanist, YOU NEED TO CARE about open access and the Creative Commons.
  • And advocate for them! Again, humanists have lagged here.

Page 43: Data and the Law

Openness and other policies

http://www.flickr.com/photos/84299143@N00/2490004138/

Page 44: Data and the Law

Open movements

•There are a lot of them. Don’t mix them up.
  • I know, I know, everybody else does. Well, everybody else is stupid! Don’t be stupid!

•Open source SOFTWARE

•Open access JOURNAL ARTICLES
  • (and occasionally books, but mostly journal articles)

•Open (government) DATA

•Open (notebook) SCIENCE
  • which is larger than open data! It opens the process of doing the science as well.

Page 45: Data and the Law

Open access funder mandate: NIH

•Congress: “Hi, NIH. We think taxpayers should be able to read the research they fund!”
  • NIH: “Cool. We’ll build a repository for it, then.”

•NIH, mid-2000s: “Hi, researchers. Please put your final manuscripts in PubMed Central.”
  • You can guess how well THAT worked. ~3% deposit rate.

•Congress: “Okay, NIH, voluntary didn’t work; how about mandatory?”
  • Current deposit rate: about 67%.
  • But the NIH has only started cracking down on slackers. (Grant cycles are long.)

Page 46: Data and the Law

Keep in mind: universities are also funders!

•DH centers, IT support, and libraries don’t exactly come free!

•But it’s not easy (maybe not possible) for a university to impose an open-access mandate the way a funder can.
  • Tradition of “faculty governance” forbids.

•Are there university OA mandates? Yes!
  • But they’re by faculty (usually faculty senates, sometimes individual schools/departments) for faculty. Always. Anything else, and faculty howl.
  • Humanists are the loudest howlers. Make of that what you will.

Page 47: Data and the Law

NSF data-management plans

•As of January 2011, all NSF grant proposals must include a two-page data-management plan.
  • Got no data? Using someone else’s? Say so!
  • Data sharing required? Not necessarily. Just data management.
  • Best practices? Standards? Depends on the discipline/directorate, but for the most part, not yet.
  • Digital data only? Absolutely not! If you’re taking physical samples, you need to talk about them too.

•Why am I talking about this here and now?
  • Because the NEH’s Office of Digital Humanities has a similar policy!
  • Because the OSTP Memo bids fair to extend this to many more agencies!

Page 48: Data and the Law

Now: OSTP Memo

•Office of Science and Technology Policy (part of the executive branch)

•Big federal funders have until the end of July to explain how they’ll achieve open access AND open data for research they fund.
  • The NEH is not subject to the memo (budget too small), but they have announced they plan to comply anyway.

•Pass the popcorn. This should be good.

Page 49: Data and the Law

How can your library help?

•Getting the word out

•Offering consultation services
  • often in collaboration with other campus units, e.g. IT
  • usually includes an informational website

•Offering institutional repositories as data home
  • This is... problematic, but it’s something.

•Training

•In a very few cases: planning for and working toward greater involvement
  • e.g. Purdue, Penn State, California Digital Library, University of Prince Edward Island

Page 50: Data and the Law

Local data policies

http://www.flickr.com/photos/84299143@N00/2490004138/

Page 51: Data and the Law

Who has policies?

Photo: “Who Am I?” Ahmad Hammoud http://www.flickr.com/photos/ahmadhammoudphotography/5212868148/ CC-BY

•Non-profit grant funders, now and then

•The federal government, more and more often

•State governments, in limited situations

•Your institution, sometimes

•Journals, sometimes
  • (not usually in the humanities)

Page 52: Data and the Law

What might a data policy cover?

•Who “owns” data

•How long you need to keep data

•When and with whom you need to share data (or are forbidden from doing so)

•What data you need to keep secure, and (sometimes) standards for doing so

•What happens to “your” data when you graduate or change jobs or institutions
  • PAY CLOSE ATTENTION TO THIS, graduate students! This can bite you!

Photo: “Martha” Ford Buchanan http://www.flickr.com/photos/fordbuchanan/4022157306/ CC-BY

Page 53: Data and the Law

Institutional policies

•Not all institutions have them.

•Not all institutions enforce them.

•But if you get in trouble, the policy will be used to throw the book at you.

•FIND OUT. Wherever you go, whatever you do, FIND OUT.

Photo: “Rowlandsway House, Wythenshawe” Gene Hunt http://www.flickr.com/photos/raver_mikey/467480300/ CC-BY

Page 54: Data and the Law

Open data ethics challenges

http://www.flickr.com/photos/84299143@N00/2490004138/

Page 55: Data and the Law

FERPA

•If you want student records for your research, plan on getting student or parental consent (depending on student’s age).
  • Caveat: if you’re doing research FOR THE SCHOOL ITSELF, you’re probably off the hook, but you can’t use the data for anything else.

•FERPA does not cover statistical data compilations in which students are not individually identifiable.

•Graded assignments are covered (because the grade is protected). An assignment printout with no grade? Not covered!

Page 56: Data and the Law

IRB data questions

•Institutional Review Board: ethics watchdog for research
  • Science has a pretty exploitative history. IRBs are designed to prevent harm to study subjects.

•Still working to catch up, mentally, to the realities of e.g. Web research, open data

•Consider referring ethics questions about data sharing to the IRB. They’re the last word.
  • Though realize you may have to educate them! IRBs are known to be... overzealous, many places.

Page 57: Data and the Law

“Extra risk”

•Key variable for IRBs is “risk to participants.”

•What are the additional risks of data retention and sharing?
  • Is Big Brother coming to get your study subjects?
  • Added deanonymization/reidentification risk? Cracking risk?
  • “If it’s on the open Internet, it’s fair game.” Well...

•IRBs not entirely up on this just now. THEY WILL LEARN.

•And more humanists, especially digital humanists, are doing work that falls under this kind of oversight.

Page 58: Data and the Law

Thanks!

•Copyright 2011 by Dorothea Salo.

•This lecture and slide deck are licensed under a Creative Commons Attribution 3.0 United States License.