Copyright Reform and Open Data

Peter Murray-Rust, contentmine.org

Improving Public Service Delivery through Open Data, InsideGovernment, London, 2015-03-19

• Copyright restricts innovation

• Governments want to reform it to generate wealth

• Content/text mining can generate wealth

• However: massive concerted opposition

Page 2: Copyright Reform and Open Data

The Right to Read is the Right to Mine

http://contentmine.org

Page 3: Copyright Reform and Open Data

Scientific and Medical publication (STM)[+]

• World Citizens pay $400,000,000,000 …

• … for research in 1,500,000 articles …

• … cost $300,000 each to create …

• … $7000 each to “publish” [*] …

• … $10,000,000,000 from academic libraries …

• … to “publishers” who forbid access to 99.9% of citizens of the world …

• 85% of medical research is wasted (not published, badly conceived, duplicated, …)

[+] Figures probably ±50%

[*] arXiv preprint server costs $7 USD per paper
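The figures on this slide can be cross-checked with a few multiplications. The sketch below is not part of the talk: the constant and function names are made up for illustration, and the ±50% tolerance is taken from the slide's own footnote. It only shows that articles × cost per article lands near the quoted totals.

```python
# Rough consistency check of the slide's figures (hypothetical names;
# the numbers are the slide's own, which it says are good to about +-50%).
ARTICLES_PER_YEAR = 1_500_000
RESEARCH_COST_PER_ARTICLE_USD = 300_000
PUBLISH_COST_PER_ARTICLE_USD = 7_000
ARXIV_COST_PER_PAPER_USD = 7


def totals():
    """Return (research total, publishing total, publish/arXiv cost ratio)."""
    research = ARTICLES_PER_YEAR * RESEARCH_COST_PER_ARTICLE_USD
    publishing = ARTICLES_PER_YEAR * PUBLISH_COST_PER_ARTICLE_USD
    ratio = PUBLISH_COST_PER_ARTICLE_USD // ARXIV_COST_PER_PAPER_USD
    return research, publishing, ratio


def within_tolerance(computed, quoted, tolerance=0.5):
    """True if computed is within +-tolerance (as a fraction) of quoted."""
    return abs(computed - quoted) <= tolerance * quoted


if __name__ == "__main__":
    research, publishing, ratio = totals()
    print(f"Research total:   ${research:,}")
    print(f"Publishing total: ${publishing:,}")
    print(f"Publishing costs {ratio}x the arXiv figure per paper")
    print(within_tolerance(research, 400_000_000_000))
    print(within_tolerance(publishing, 10_000_000_000))
```

Under these assumptions the products come to $450 billion for research and $10.5 billion for publishing, both within the slide's stated tolerance of the quoted $400 billion and $10 billion.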

Page 4: Copyright Reform and Open Data

http://chemicaltagger.ch.cam.ac.uk/

Typical chemical synthesis

Ca. 10 million of these paragraphs are published per year; the chemical information market is worth ≥ 1 billion USD

Page 5: Copyright Reform and Open Data

Open Content Mining of FACTs

Machines can interpret chemical reactions

We have processed 500,000 patents. There are > 3,000,000 reactions/year. Added value > 1 billion EUR.

Page 6: Copyright Reform and Open Data

Typical Clinical Trial

Effect of drug over time. It takes ½ day to extract data by hand. contentmine.org can do it in 1 second, BUT will technically break copyright.

Page 7: Copyright Reform and Open Data

Prof. Ian Hargreaves (2011), on “David Cameron's exam question”:

“Could it be true that laws designed more than three centuries ago with the express purpose of creating economic incentives for innovation by protecting creators' rights are today obstructing innovation and economic growth?”

“Yes. We have found that the UK's intellectual property framework, especially with regard to copyright, is falling behind what is needed.”

Image: “Digital Opportunity” by Prof Ian Hargreaves - http://www.ipo.gov.uk/ipreview.htm. Licensed under CC BY 3.0 via Wikipedia - https://en.wikipedia.org/wiki/File:Digital_Opportunity.jpg#/media/File:Digital_Opportunity.jpg

Page 8: Copyright Reform and Open Data

http://www.jisc.ac.uk/reports/value-and-benefits-of-text-mining

In order to be 'mined', text must be accessed, copied, analysed, annotated and related to existing information and understanding. Even if the user has access rights to the material, making annotated copies can be illegal under current copyright law without the permission of the copyright holder.

there is a new conundrum: the market intervention of copyright – originally intended to protect creative producers – may be inhibiting new knowledge discovery and innovation.

It is often unclear whether text mining is a permissible use

Page 9: Copyright Reform and Open Data

• “creative use of these large data sets in the US health care sector could generate more than $300bn in value per annum” [MGI, McKinsey]

• Gartner Inc. has identified 'Big Data' and 'Next-Generation Analytics' as two of the 'Top 10 Strategic Technologies' for 2012.

• Given the volume of text generated by business, academic and social activities – in for example competitor reports, research publications or customer opinions on social networking sites – text mining is, however, highly important. [JISC]

• there are some tasks that simply could not be achieved without using text mining. For example, a major pharmaceutical company used text mining tools to evaluate 50,000 patents in 18 months. This would have taken 50 person years to achieve manually, meaning that it would not even have been contemplated. [JISC]

“Big Data” and Analytics (ContentMining)

Page 10: Copyright Reform and Open Data

Hargreaves’ Recommendations 2011[*]

• Government should deliver copyright exceptions at national level to realise all the opportunities within the EU framework, including format shifting, parody, non-commercial research, and library archiving.

• The UK should also promote at EU level an exception to support text and data analytics. The UK should give a lead at EU level to develop a further copyright exception designed to build into the EU framework adaptability to new technologies.

[*] Now UK law (2014-06 and 2014-10)

Page 11: Copyright Reform and Open Data

2014-10 this has now been overturned by a higher court

Page 12: Copyright Reform and Open Data

PUBLISHER TDM LICENCE INITIATIVES GENERALLY DO NOT HELP

• Publishers have started offering their own TDM licences and policies

• Their licences often impose unfair (and in the case of the UK, unenforceable) constraints on researchers’ freedom to exploit TDM, e.g., requiring users to employ the publisher’s API, putting unnecessary restrictions on how much can be copied, or how fast it can be copied.

• Why “unenforceable”? Because, as noted earlier, UK law specifically states that any contract or licence term that prevents anyone from doing TDM in the manner prescribed in the new exception shall be deemed null and void.

• Really need a test case on these attempted restrictions.

• Springer and Royal Society offer generous TDM provisions.

• So why are so many publishers offering restrictive licences in the UK? Maybe they hope licensees are ignorant of the strength of the new law, or the publishers in fact don’t know about it. So they are either deliberately misleading, or ignorant.

Prof Charles Oppenheim and contentmine.org

Page 13: Copyright Reform and Open Data

As revealed by the 'public domain calculator’ established by Europeana, there is a staggering complexity in the determination of the different copyright term lengths in member states, some of them requiring knowledge about the circumstances of the author's death or about the situation of the author's heirs at the time of her death –

Public domain material is frequently “re-copyrighted” (Copyfraud)

Which English storybook will never be public domain[1]?

[1] Peter Pan

The Public domain

Page 14: Copyright Reform and Open Data

The Right to Read is The Right To Mine

PMR in 2012: http://blog.okfn.org/2012/06/01/the-right-to-read-is-the-right-to-mine/

Page 15: Copyright Reform and Open Data

The Hague Declaration[*]

• Intellectual property was not designed to regulate the free flow of facts, data and ideas, nor should it.

• Freedom to analyse and pursue intellectual curiosity without fear of monitoring or repercussions must not be eroded in the digital environment.

• Ethics around the use of data and content mining will need to continue to evolve in response to changing technology

• Open access as “a comprehensive source of human knowledge and cultural heritage” must be pursued in order to increase the uptake of and equality of access to content mining technologies.

• Innovation and commercial research based on the use of facts, data, and ideas should not be restricted by intellectual property law.

[*] Drafted 2014-12, convened by LIBER (Association of European Research Libraries)

Page 16: Copyright Reform and Open Data

The Hague Declaration on Knowledge Discovery in the Digital Age 2015

The potential benefits of Text and Data Mining are vast and include:

• Addressing grand challenges such as climate change and global epidemics

• Improving population health, wealth and development

• Creating new jobs and employment

• Exponentially increasing the speed and progress of science through new insights and greater efficiency of research

• Increasing transparency of Governments and their actions

• Fostering innovation and collaboration and boosting the impact of open science

• Creating tools for education and research

• Providing new and richer cultural insights

• Speeding economic and social development in all parts of the globe

• LIBER – Association of European Research Libraries

Page 17: Copyright Reform and Open Data

Julia Reda, Pirate Party MEP

Page 18: Copyright Reform and Open Data
Page 19: Copyright Reform and Open Data

Some of Reda’s 25 recommendations

• introduction of a single European Copyright Title based on Article 118 TFEU that would apply directly and uniformly across the Union,

• the EU legislator should further lower the barriers for re-use of public sector information by exempting works produced by the public sector - within the political, legal and administrative process - from copyright protection;

• the ability to freely link from one resource to another is one of the fundamental building blocks of the Internet;

• Emphasises that the exception for caricature, parody and pastiche should apply regardless of the purpose of the parodic use;

• Stresses the need to enable automated analytical techniques for text and data (e.g. 'text and data mining') for all purposes, provided that the permission to read the work has been acquired;

Page 20: Copyright Reform and Open Data

Some Contacts/references

• JISC report on benefits of TDM - D. McDonald and U. Kelly, The value and benefits of text mining (2012), http://www.jisc.ac.uk/reports/value-and-benefits-of-text-mining

• Official guidance on the new UK copyright exception for TDM - https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/315014/copyright-guidance-research.pdf

• Excellent general overview of the change to UK law and its implications - http://copyrightuser.org/topics/text-and-data-mining/ - provides link to the precise wording in the law

• http://contentmine.org

Page 21: Copyright Reform and Open Data

THE NEW UK LAW

• Came into force in June 2014

• Specific exception to copyright for TDM

• From the official guidance to the new exception issued by HMG: “Text and data mining usually requires copying of the work to be analysed. An exception to copyright exists which allows researchers to make copies of any copyright material for the purpose of computational analysis if they already have the right to read the work (that is, they have ‘lawful access’ to the work). This exception only permits the making of copies for the purpose of text and data mining for non-commercial research. Publishers and content providers will be able to apply reasonable measures to maintain their network security or stability but these measures should not prevent or unreasonably restrict researcher’s ability to text and data mine. Contract terms that stop researchers making copies to carry out text and data mining will be unenforceable.”

• Does not apply to database right. Interesting problem if a particular database enjoys both copyright and database right!

• UK researchers do not have to ask for permission, pay fees, etc., to do such TDM

• What is, or is not, “non-commercial”? Not always clear! The question must be asked as at the time the TDM was undertaken, so unexpected commercial benefits at the end of the project are OK, so long as the intent at the time was non-commercial

• “Lawful access” usually means licensed content, whether OA or a subscription to the materials, but also includes lawful access to printed works held in, say, a library

• Prof Charles Oppenheim and contentmine.org