CLARIN WP7: Intellectual Property Rights and other Legal Matters
Kimmo Koskenniemi
18 February 2008
University of Helsinki (UHEL)
Department of General Linguistics
Faculty of Arts
Dec 31, 2015
WP7: Overall goals
WP7 deals with copyrights, patents, licensing, authorization and authentication, which are needed for the proper use of language resources.
Whereas WP5 works with the standards and taxonomies of the language resources and tools, WP7 deals with the copyrights and licenses of these.
The authorization and authentication patterns of WP7 are shared with WP2, where their technical implementation is treated.
[Figure: diagram of the parties involved: User, LR Collectors (1..k), Authors/Publishers (1..n), Computing Centres (1..m), Trusted Organization]
Copyright and licenses (WG2A)
Copyright restricts showing a work in public and making copies of it (a work being a text, spoken material, etc.).
Copyright allows citations and making a few copies for one's personal use (with some restrictions).
A license is an agreement which modifies the restrictions and permissions of using a work.
Licenses of language materials allow a little beyond what copyright would permit anyway (i.e. making a copy or showing the text to selected people).
Licenses mostly restrict and narrow down the rights a user would otherwise have (according to the copyright legislation).
Questions about copyright (WG2A)
An author transfers all or a part of his/her copyrights to a publisher (but does not necessarily know which). Some rights can be transferred and some cannot.
A transfer of "all rights" need not in fact include every right.
Are individuals allowed to make personal electronic copies of works? What is "personal use"?
What is a derivative of a work and how far is the copyright inherited? A full concordance? A list of words with frequencies? Common words in several corpora?
White/black/grey areas, where grey can turn into black or white at a trial in court.
Migration of licenses for materials (WG2A)
Migrate old licenses into new standard ones by finding a standard license which grants fewer rights and imposes more restrictions, and use that.
If the user believes that he/she has more restrictions and fewer rights than is actually the case, no harm is done.
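The migration rule above can be sketched in code. This is a minimal illustration only, assuming that a license can be reduced to a set of granted rights and a set of imposed restrictions. The license names, the right and restriction labels, and the tie-breaking rule (prefer the candidate that gives up the least) are all hypothetical and not part of the WP7 plan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class License:
    name: str
    rights: frozenset        # what the user is allowed to do
    restrictions: frozenset  # conditions imposed on the user


def is_safe_replacement(old: License, new: License) -> bool:
    """True if `new` grants no right beyond `old` and keeps every restriction of `old`."""
    return new.rights <= old.rights and new.restrictions >= old.restrictions


def migrate(old: License, standard_licenses):
    """Pick a standard license that is at least as strict as the old one, or None."""
    candidates = [s for s in standard_licenses if is_safe_replacement(old, s)]
    if not candidates:
        return None
    # Assumed tie-break: keep as many rights and add as few restrictions as possible.
    return max(candidates, key=lambda s: (len(s.rights), -len(s.restrictions)))


# Hypothetical data, for illustration only.
old = License("legacy-corpus-license",
              frozenset({"research", "teaching", "local-copy"}),
              frozenset({"no-redistribution"}))
standards = [
    License("STD-RES", frozenset({"research"}),
            frozenset({"no-redistribution", "cite-source"})),
    License("STD-ACA", frozenset({"research", "teaching"}),
            frozenset({"no-redistribution"})),
    License("STD-PUB", frozenset({"research", "teaching", "local-copy", "redistribute"}),
            frozenset()),
]

chosen = migrate(old, standards)
print(chosen.name if chosen else "no safe standard license found")
```

With this data the sketch selects STD-ACA: STD-PUB is rejected because it would grant a right the old license never gave, and STD-RES is a valid but stricter candidate.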
Automating the granting of access (WG2A)
It is easy to grant access to whole groups if that is allowed.
The granting of access to individuals should be based on features which can be certified at the site where the user works. If we trust that organization, the granting of access can be automated.
This puts special requirements on what facts and commitments the prospective user must sign him/herself and what the home organization must certify. (Avoid criteria which would need individual checking.)
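The automated decision described above can be sketched as follows. This is only an illustration under stated assumptions: the organization names, attribute labels and commitment labels are hypothetical, and a real system would receive the certified attributes through whatever authentication infrastructure WP2 specifies.

```python
from dataclasses import dataclass

# Hypothetical list of home organizations the collector has a contract with.
TRUSTED_ORGANIZATIONS = frozenset({"univ-a.example", "univ-b.example"})


@dataclass(frozen=True)
class AccessRequest:
    home_organization: str
    certified_attributes: frozenset  # certified by the home organization
    signed_commitments: frozenset    # signed by the user him/herself


@dataclass(frozen=True)
class AccessPolicy:
    allowed_roles: frozenset         # any one certified role is sufficient
    required_commitments: frozenset  # every commitment must be signed


def grant_access(request: AccessRequest, policy: AccessPolicy,
                 trusted=TRUSTED_ORGANIZATIONS) -> bool:
    """Decide without individual checking, using only certified and signed facts."""
    if request.home_organization not in trusted:
        return False
    if not (request.certified_attributes & policy.allowed_roles):
        return False
    return policy.required_commitments <= request.signed_commitments


# Hypothetical policy and requests, for illustration only.
policy = AccessPolicy(frozenset({"staff", "student"}),
                      frozenset({"research-use-only", "no-redistribution"}))
ok_request = AccessRequest("univ-a.example", frozenset({"student"}),
                           frozenset({"research-use-only", "no-redistribution"}))
unknown_org = AccessRequest("company.example", frozenset({"staff"}),
                            frozenset({"research-use-only", "no-redistribution"}))

print(grant_access(ok_request, policy))
print(grant_access(unknown_org, policy))
```

The split between `certified_attributes` and `signed_commitments` mirrors the distinction between what the home organization certifies and what the user signs; every criterion is a simple set test, so none needs individual checking.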
To do in WG2A
Build a network of contacts, one in each country.
Collect an inventory of existing licenses (and standard licenses).
Prepare information leaflets for publishers.
Collect information on differences in national interpretations of copyright legislation.
Invite the various parties involved to the WG: publishers, organizations for authors, potential "trusted organizations".
Find out what kind of information can be collected for the applications and what will satisfy the copyright owners.
Sketch a set of a few model licenses.
Pursue fair use of materials for research purposes in Europe as well (as in the USA).
Retaining trust at a distance (WG4)
Study techniques for remote authentication and authorization from the juridical and contractual point of view.
What could be done using electronic signatures and when can they be used?
Set up rules and model contracts which the computing centres, users, material collectors, software providers, authors and publishers should obey.
Trust and credibility (WG4)
The publisher/author must trust the collector (license).
The collector must trust the trusted organizations and that they correctly authenticate their staff or student users and give correct and reliable information about them (a contract).
The collector must trust the centres where the materials are deposited so that they only allow authorized users to access the materials (agreement for depositing and services).
To do in WG4
Collect a set of relevant contact persons.
Collect existing agreements (between computing centres, collectors etc.).
Make an inventory of existing practices of authentication and authorization in universities and other organizations. (How much do the organizations themselves trust their electronic authentication?)
Language tools for CLARIN (WG2B)
CLARIN needs human language technologies for a wide array of languages, cf. also WP5 and WP2.
Normalizing, parsing and processing of the materials should be multilingual (maybe some 100 languages).
Programs may have licenses which restrict their use together with other programs or materials.
Open source tool licenses (WG2B)
Tools under open source licenses would be useful for CLARIN software as they guarantee the freedom of maintaining and developing them as needed.
The use of open source programs is free (they can be applied to any data and be used even commercially).
CLARIN will follow open source principles in its own software where possible.
Open source programs can usually, but not always, be combined to create larger systems.
WP7 studies existing open source software licenses and produces recommendations for open source licenses to be used for CLARIN software.
Proprietary tool licenses (WG2B)
Proprietary programs have individual licenses which differ from each other.
Some programs cannot be combined with other programs because of (often unintended) contradicting clauses in their licenses.
For example, proprietary commercial software generally cannot be combined with free software under the GNU GPL.
WP7 will prepare model contracts for commercial software to be included in CLARIN services.
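The idea of contradicting clauses can be illustrated with a small check over pairs of licenses. This is a rough sketch, assuming each license is summarized as a set of conditions it requires of a combined system and a set it forbids. The tool names and clause labels are hypothetical and do not describe any real license.

```python
from itertools import combinations


def conflicts(a: dict, b: dict) -> set:
    """Clauses that one license requires while the other forbids them."""
    return (a["requires"] & b["forbids"]) | (b["requires"] & a["forbids"])


def incompatible_pairs(licenses: dict) -> dict:
    """Map each clashing pair of tool names to the clauses that clash."""
    result = {}
    for (name1, lic1), (name2, lic2) in combinations(licenses.items(), 2):
        clash = conflicts(lic1, lic2)
        if clash:
            result[(name1, name2)] = clash
    return result


# Hypothetical license summaries, for illustration only.
tool_licenses = {
    "tool-A": {"requires": {"source-disclosure"}, "forbids": set()},
    "tool-B": {"requires": set(), "forbids": {"source-disclosure"}},
    "tool-C": {"requires": {"attribution"}, "forbids": set()},
}

clashes = incompatible_pairs(tool_licenses)
for pair, clauses in clashes.items():
    print(pair, sorted(clauses))
```

Here only tool-A and tool-B clash, because one requires what the other forbids. Real license texts are rarely this clear-cut, so a check of this kind could at most flag pairs for legal review.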
[Figure: diagram with the labels: language councils, research centres; research projects; SMEs for speech technology & dialogue applications; SMEs for information retrieval; universities; SMEs for authoring tool products; text corpora; lexica as DBs; speech corpora; tree banks; word nets; open src LR tools]
ELDA and LDC (WG3)
Become acquainted with the operations and principles of ELDA (agreements, rules and practices).
WP7 will define the relation between ELRA/ELDA and CLARIN, and the cooperation with them.
IPR legislation (WG2C)
Collect members with legal expertise from several countries.
Study European IPR legislation and practices.
Possibly produce recommendations for alterations in the legislation.
Partners in WP7
UHEL: University of Helsinki (24 pm)
ELDA: European Language Resources Distribution Agency S.A. (2 pm)
UTU: Eberhard Karls Universität Tübingen (2 pm)
INL: Instituut voor Nederlandse Lexicologie (2 pm)
ILSP: Institute for Language and Speech Processing - Athena Research and Innovation Centre in Information, Communication and Knowledge Technologies (2 pm)
Deliverables of WP7
D7S-2.1 A report including Model Licensing Templates and Authorization and Authentication Scheme (20 pm, month 36)
D7S-3.1 Collaboration Plan between CLARIN and external services (4 pm, month 24)
D7S-4.1 Set of Federation Agreements for CLARIN centres (8 pm, month 36)
Ethical rules (no group yet)
WP7 also considers the ethical rules and recommendations related to the language resources.
Recordings and other personal documents need special attention as they contain sensitive data.
Commercial or free? (No group yet)
Traditionally, research use of LRs has been free of charge, but there are exceptions.
Some services charge for access: e.g. the Russian Integrum offers almost all published Russian newspapers and periodicals in a commercial online retrieval system which is also used by researchers (at a cost).
Integrum probably gets its data free but pays a royalty to the publishers according to the use of their data. In this way, the coverage of Integrum is exceptionally good.
WP7 will study the option of including commercial resources in the CLARIN scheme.