Open Archives and Open Libraries Thomas Krichel 2003-06-22
Mar 27, 2015
who am I?
• I was an economist.
• I was a leisure digital librarian.
– NetEc 1993
– RePEc 1997
• I am a geek.
• I am a visionary.
– but not St. John the Baptist
Who is he?
St. IGNUcius
• A humorous creation of Richard M. Stallman (RMS)
• RMS is the father of the free software movement
– a geek
– a visionary
• St. IGNUcius shows an emphasis on the moral case for free software.
moral case and business case
• Other folks in the free software movement stress the need to demonstrate the business case for free software.
• They tend to avoid the word free, because free can mean cheap and cheap can mean bad.
• They use the term "open source software".
RMS and us
• Some of us are already developing and using free software.
• I say: we librarians need to learn more from the free software movement.
• We need to make the concepts coming out of free software more a part of our business.
• Let us look at a key concept: free software.
free software according to RMS
• Free software comes with four freedoms
– The freedom to run the software, for any purpose
– The freedom to study how the program works, and adapt it to your needs
– The freedom to redistribute copies so you can help your neighbor
– The freedom to improve the program, and release your improvements to the public, so that the whole community benefits
free speech and free beer
• Free software does not mean $0
• The term "free" in free software should be interpreted as "freedom to do things with it".
what has this to do with us?
• Just replace free software with free information.
• Libraries are about free information.
• But the analogy is not quite as simple.
– When we talk about free information, we usually mean things that we can freely read (download…). Free as in: $0.
– We do not usually mean free information as information we are free to do things with. Free as in freedom.
moral and business
• There is a moral case for free information.
– We rely on it.
• There is a business case for free information.
– We need to make our own.
we rely on the moral case
• The citizen should be informed…
• Individuals in the organization should have free access…
• This is how we justify resources given to us.
• Often, members of the community who pay get privileged access.
from moral case to business case
• To form the business case for free information, think of "free information" as "freedom to do things" rather than $0.
• Thus libraries can make a crucial business case for themselves as agents that transform information.
• Recall that there are whole industries out there that produce free information.
was this seminar not about open archives?
• Open archives are crucial tools for the development of libraries that transform freely available information.
• By analogy to the term "open archives" I will say that the "libraries that transform freely available information" are "open libraries". These are usually digital libraries.
what are open archives
• They are machines that may or may not store items.
• Data or metadata records about these items are made available through a machine interface.
• One possible interface is defined by the OAI Protocol for Metadata Harvesting (OAI-PMH). In the following I will be assuming that any open archive runs that protocol.
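To make the machine interface concrete, here is a minimal sketch of what a harvester might do with an OAI-PMH archive: build a `ListRecords` request and pull identifiers and titles out of the response. The base URL, the sample response and the helper names are assumptions made for illustration; only the verb, the `metadataPrefix` parameter and the XML namespaces come from the OAI-PMH 2.0 protocol itself. No network access is involved, the response is parsed from an in-memory string.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces used by OAI-PMH 2.0 and by its mandatory oai_dc metadata format.
OAI_NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the URL of an OAI-PMH ListRecords request (helper name is hypothetical)."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def parse_records(xml_text):
    """Return (identifier, title) pairs found in a ListRecords response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for record in root.iterfind(".//oai:record", OAI_NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=OAI_NS)
        title = record.findtext(".//dc:title", namespaces=OAI_NS)
        pairs.append((identifier, title))
    return pairs


# A hand-written, shortened response from a made-up archive.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:archive.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A sample paper</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

request_url = list_records_url("http://archive.example.org/oai")
harvested = parse_records(SAMPLE_RESPONSE)
```

A real harvester would fetch `request_url` over HTTP and follow resumption tokens for long result lists; both are left out here to keep the sketch short.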
why do open archives matter
• Open archives are specifically set up to allow machine readable access to information.
• Thus presumably there is a permission to further process the information. "cogito, ergo sum" logic.
• You may think of the act of establishing an open archive as an early 3rd millennium digital ritual.
open archives and open libraries
• In the early history of open archives, their main use is as metadata repositories.
• We can build a simple open library by aggregating contents from many open archives.
• But we can do more.
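The simple aggregating library mentioned above could look roughly like this: records harvested from several archives are merged into one collection keyed by record identifier. The archive names, record fields and the "first archive wins" rule are assumptions chosen for the sketch, not a description of any existing service.

```python
def aggregate(archives):
    """Merge records from several archives into one dict keyed by record id."""
    library = {}
    for archive_name, records in archives.items():
        for record in records:
            # Remember which archive supplied the record.
            entry = dict(record, source=archive_name)
            # Assumption: if two archives hold the same id, keep the first one seen.
            library.setdefault(record["id"], entry)
    return library


# Hypothetical harvested data from two archives.
archives = {
    "archive_a": [{"id": "paper:1", "title": "On open archives"}],
    "archive_b": [
        {"id": "paper:2", "title": "On open libraries"},
        {"id": "paper:1", "title": "On open archives"},
    ],
}
library = aggregate(archives)
```

Aggregation alone only pools the records. Identifying and relating them is the extra work that an open library adds.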
what do open libraries do?
• Identify records found in open archives.
• Relate identified records in open archives with each other.
• These actions require human control.
example from RePEc
• There are 300+ archives that contribute to RePEc data about publications.
• That data has author name strings.
• A special open archive furnishes access control records. These records list author names and the identifiers of the paper records for the papers each author wrote.
• This is classic access control, but done by the authors.
• An open archive exports the author data…
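As a rough sketch of the data structure behind this slide, an author record can hold the name variants an author uses plus the identifiers of the papers the author has claimed. The class, field names and identifier scheme below are hypothetical and are not the actual RePEc record format.

```python
from dataclasses import dataclass, field


@dataclass
class AuthorRecord:
    """One registered author: name variants plus the papers they claimed."""
    author_id: str                                # hypothetical identifier scheme
    names: set = field(default_factory=set)       # name strings as they appear on papers
    papers: set = field(default_factory=set)      # paper record identifiers


def authors_of(paper_id, author_records):
    """Return the ids of registered authors who claimed the given paper."""
    return sorted(r.author_id for r in author_records if paper_id in r.papers)


# Two different people can share the name string "J. Smith";
# the claimed paper identifiers tell them apart.
records = [
    AuthorRecord("author:1", {"J. Smith", "Jane Smith"}, {"paper:1", "paper:2"}),
    AuthorRecord("author:2", {"J. Smith"}, {"paper:3"}),
]
```

The point of the design is that the name string alone is ambiguous, while the link from an author record to a paper identifier is not.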
why do authors register?
• Authors perceive the registration as a way to achieve common advertising for their papers.
• Author records are used to aggregate usage logs across RePEc user services for all papers of an author.
• Open archives at the RePEc user services export usage data.
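The aggregation described here can be sketched as a two-step sum: first add up per-paper counts across user services, then add up the papers of each author. The service names, counts and identifiers are invented for illustration.

```python
from collections import Counter


def downloads_per_author(usage_by_service, papers_by_author):
    """Sum usage counts over all services, then over each author's papers."""
    per_paper = Counter()
    for counts in usage_by_service.values():
        per_paper.update(counts)  # adds the counts paper by paper
    return {
        author: sum(per_paper[paper] for paper in papers)
        for author, papers in papers_by_author.items()
    }


# Hypothetical usage data exported by two user services.
usage_by_service = {
    "service_x": {"paper:1": 3, "paper:2": 1},
    "service_y": {"paper:1": 2},
}
# Hypothetical links taken from author records.
papers_by_author = {
    "author:a": ["paper:1", "paper:2"],
    "author:b": ["paper:3"],
}
totals = downloads_per_author(usage_by_service, papers_by_author)
```

Without the author records there would be no reliable way to know which papers to sum for one person.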
open library idea: serials data
• Serial level information is a crucial component of academic library data.
• Idea: build and maintain free serial records.
• Two ways to build:
– Use volunteers and collect in a decentralized way.
– Make an expensive central collection, disseminate well, charge $$$ for record changes later.
another open library idea: law
• Many legal texts are de jure free.
• De facto there are two companies that have comprehensive collections and charge a lot of money for the free information bundled with proprietary information.
• Our moral case calls for a replacement!
(it will also create jobs for us)
free legal open library
• Have all laws and cases
– online (open archives)
– as text (open archives)
– identified (open library)
• Have citation metadata, so that legal citations can be verified while composing case data.
• Registration procedure to verify the integrity of data.
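Once laws and cases carry identifiers, verifying a citation while drafting becomes a lookup against the set of identified documents. The sketch below assumes a made-up identifier scheme and a registry held in memory; it only illustrates the idea.

```python
def check_citations(cited_ids, registry):
    """Split cited identifiers into those found in the registry and those not found."""
    known = [c for c in cited_ids if c in registry]
    unknown = [c for c in cited_ids if c not in registry]
    return known, unknown


# Hypothetical identifiers of documents the open library has identified.
registry = {"law:2001-17", "case:1999-042"}

# Citations appearing in a draft; the last one is a typo.
draft_citations = ["law:2001-17", "case:1999-042", "case:1999-024"]
known, unknown = check_citations(draft_citations, registry)
```

An editing tool could flag everything in `unknown` before the case data is published.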
open library idea II: drugs
• Collect data on the composition of all drugs
– drug composition reported by drug companies, using open archives
– drug components documented by the governments, using an open archive
• Open library brings the two together!
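Bringing the two sources together amounts to a join on the component name. Here is a minimal sketch, with invented drug and component names and a plain dictionary standing in for each harvested archive.

```python
def join_composition(company_reports, component_docs):
    """Attach government documentation to each component a company reports."""
    return {
        drug: {component: component_docs.get(component) for component in components}
        for drug, components in company_reports.items()
    }


# Hypothetical data harvested from company archives.
company_reports = {"drug_x": ["component_1", "component_2"]}
# Hypothetical data harvested from a government archive.
component_docs = {"component_1": "documentation for component_1"}

combined = join_composition(company_reports, component_docs)
```

A component that maps to `None` is one the companies report but the government archive does not document, which is exactly the kind of gap the combined view exposes.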
Am I crazy?
• Money does not make the world go round. Ideas do.
• When RMS proposed a free replacement for UNIX in the early 80s, most people dismissed the idea.
• Today it is reality!
• Similarly, when I started to work on RePEc, a totally free and improved A&I dataset, in 1993, nobody gave it a high probability of success.
• It will be a reality!
obstacles to open archives & open libraries
• lack of imagination
• lack of entrepreneurship
• inability to form alliances
• user-centered thinking
• document-centered thinking
• technical competence required
– OAI PMH
– XML and XML Schema
– Unicode
• the "C" word
what I do for open libraries
• Create an open library for library science: the rclis (reckless) dataset.
• Create a supporting organization: the open library society.
• co-workers welcome!
conclusion
• The open library is a business idea to move free information powered by libraries from the paper to the digital world.
• Open archives are a sine qua non component of the business idea.
• Open archives furnish information that we are free to further process (as opposed to consume).
http://openlib.org/home/krichel
Thank you for your attention!