LIS654 lecture 2 history Thomas Krichel 2011-09-14

LIS654 lecture 2

history

Thomas Krichel, 2011-09-14

contents

• some known old thinkers
– Vannevar Bush
– Joseph Carl Robnett Licklider, aka “Lick”

• the birth of the repository

background

• Vannevar Bush (1890–1974) directed the US Office of Scientific Research and Development during WW2.

• As the war ended, he saw two problems:
– how to make the wartime scientific reports available
– how to find a new challenge for the scientists

• He proposed a solution in “As we may think”.

As we may think

• Vannevar Bush (1890–1974) wrote his famous essay in 1945.

• It remains to date one of the most frequently cited papers in Library and Information Science.

• I think this fame is somewhat undeserved.

the scientific record

• As scientists do more work, the “record” extends. This is good.

• Recent advances in microfilm also made it possible to store more of the record on microfilm.

• But with much research and increased specialization, “significant attainments become lost in the mass of the inconsequential”.

the memex

• The memex was a proposed desktop machine that would store millions of books in microfilm.

• It would have a mechanism that would allow any known item from the collection to be retrieved rapidly.

• But the problem is: which items should one look at?

organizing the collection

• Even today, collections are organized hierarchically, from class to subclass. Think of standard classification schemes.

• Or they are arranged as a list of words, mechanically, from letter to letter.

• Bush rejected this, claiming that the brain does not work in that way.

the brain, by Bush

• Bush thought that the brain works by association.

• “With one item in its grasp, [the brain] snaps instantly to the next that is suggested by the association of thoughts”.

• This is done “in accordance with some intricate web of trails carried by the cells of the brain.”

memex as a brain

• Every time a document is added to the memex it is given an identifier.

• Every time an item is consulted the user can associate with it other items. These associations are recorded.

• Trails of associations can be annotated and copied.

• Selection by association replaces indexing.
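
The bookkeeping on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Bush described a microfilm desk, not software, and the class name `Memex` and the methods `add`, `associate`, `follow` and `copy_trail` are invented here. Items get an identifier when added, associations are recorded on named trails, and a trail can be annotated and copied into another memex.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Memex:
    """Toy in-memory store (hypothetical, not Bush's design)."""
    items: dict = field(default_factory=dict)    # identifier -> document
    trails: dict = field(default_factory=dict)   # trail name -> [(identifier, note)]
    _ids: count = field(default_factory=lambda: count(1))

    def add(self, document):
        """Store a document and return the identifier it was given."""
        ident = next(self._ids)
        self.items[ident] = document
        return ident

    def associate(self, trail, ident, note=""):
        """Record that an item belongs on a trail, with an optional annotation."""
        if ident not in self.items:
            raise KeyError(ident)
        self.trails.setdefault(trail, []).append((ident, note))

    def follow(self, trail):
        """Return the documents on a trail in the order they were linked."""
        return [self.items[i] for i, _ in self.trails.get(trail, [])]

    def copy_trail(self, trail, other):
        """Copy a trail, with its items and annotations, into another memex."""
        for ident, note in self.trails.get(trail, []):
            other.associate(trail, other.add(self.items[ident]), note)


# Sample data, made up for the illustration.
mine = Memex()
a = mine.add("article on the short bow")
b = mine.add("article on elasticity")
mine.associate("bows", a, "start here")
mine.associate("bows", b, "explains the physics")

yours = Memex()
mine.copy_trail("bows", yours)
print(yours.follow("bows"))
```

Note that nothing here is looked up through an index or a classification scheme: the only way to reach an item is by its identifier or along a trail.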

sharing

• An annotated trail between items can form a new item. That item can be shared.

• Bush envisioned that there would be a way for each memex to learn from all other memexes.

• Memex users would improve their thinking ability by its use.

• This would greatly increase the speed of scientific discoveries.

implementation

• There is no evidence that anything like the memex was ever built.

• Microfilm was replaced by digitization.

• But the idea of associative trails or associative indexing is related to hypermedia.

• The latter goes back to Ted Nelson.

Licklider

• Joseph Carl Robnett Licklider (1915–1990) trained as a mathematician and psychologist and worked mainly at MIT.

• The Council on Library Resources got funding from the Ford Foundation to examine how technology could help libraries.

• The work was undertaken at Bolt, Beranek and Newman (BBN), later of ARPANET fame.

the basic idea

• The idea was that one could store all knowledge in a single or distributed machine.

• How should this be done?

• Well, first Lick estimated the corpus would be 10^14 bytes by the year 2000.

• That’s 100 TB, or about five 20 TB systems.

• It could be a central system with thin clients.

the system

• The system was called “procognitive”, meaning for the advancement of knowledge.

• It would not be based on documents, metadata and retrieval.

• It would process information into knowledge and questions into answers.

• Users transmit their knowledge to the system.

information to knowledge

• To see how information can be processed into knowledge, Lick looked at the human brain. He had studied cat brains in his PhD work.

• If it is possible to process the body of information into knowledge structures, then questions can be answered by knowledge rather than by documents.

Lick on the brain

• The brain receives stimuli and stores representations of them.

• The brain finds answers to questions by processing stored memory on the basis of “schemata”, which are ways in which stored stimuli representations can be processed.

human processing

• Lick understood that current and foreseeable technology would not allow processing of documents into knowledge.

• This would be the job of a set of librarians called “procognitive system specialists”.

• They would encode the contents of documents in a knowledge language.

• They would watch for ambiguity warnings.

• Users would also provide feedback.

encoding

• Surprisingly, Lick still imagined that the procognitive system would be based on natural language.

• The hope was that artificial intelligence (AI) methods would be developed to extract information from documents.

• That hope seemed justified in the 60s when AI was in its infancy.

steps to implementation

• The first attempts, in the 60s, tried to find the citation string in a database of citations.

• Thus this was more information retrieval on a small set of metadata than actual digital library work.

• Librarians preparing bibliographies for researchers were the prime users.

into the 80s

• In the 80s the personal computer “came back”.

• Searching could be done on the full text of documents.

• Browsing became available.

90s

• In the 90s the Internet and the search engine came along.

• Initially search engines followed standard information retrieval principles.

• My first work, about 1993, was based on gopher access and WAIS indexing.

the semantic web

• The semantic web is the actual successor to Lick’s vision.

• It’s still not done.

• I speculate it will not be done for a long time.

• The reason is that while Lick thought about psychology and computers, he did not think through the economics of operating systems such as the ones that he proposed.

• He also had too optimistic a vision about AI.

back in the trenches

• As we have seen, early digital library visions were inspired by concern about access to scientific documents.

• The academic digital library was synonymous with the digital library.

• So all the progress was there, and pretty much still is.

academic documents

• There are basically two types of academic documents.

• There are academic books and academic articles or papers.

• Both have been treated differently in the past and continue to be treated differently, though maybe not for long.

academic books (monographs)

• Books are (were) purchased by libraries.

• They were cataloged into the local integrated library system
– locally or
– through shared cataloging

academic papers

• Most of them are published in serials.

• Libraries never catalogued them locally.

• They relied on third parties to provide abstracting and indexing services for them.

publishing academic papers

• Published academic papers go through a process of peer review.

• Papers are written for free by some academics.

• They are reviewed for free by other academics.

• The profits from publishing go to the publisher. Academic publishing is very profitable.

non-formal publication

• Some academic disciplines have a tradition of informal publication of papers that have not been peer-reviewed.

• These are:
– mathematicians and physicists have preprints.
– computer scientists and economists have working papers.

preprints vs working papers

• Preprints were sent by academics to colleagues.

• Working papers are issued by departments and sent to other departments under exchange agreements.

• Whatever the mode of working, non-formal publication channels enabled librarians to build real digital libraries.

• Actually, they were built more by their users.

xxx.lanl.gov

• This was/is a preprint server started by Paul Ginsparg at Los Alamos National Laboratory.

• It has been popular with physicists and mathematicians.

• Its coverage of sub-disciplines varies.

• It became arXiv.org.

• It moved to Cornell University in 2001.

• It is now run by Cornell University Library.

NCSTRL

• was the Networked Computer Science Technical Reports Library, a DARPA/NSF-funded project that built an infrastructure for publishing computer science working papers.

• Starting in 1993, it was built on a formal protocol called Dienst. This enabled local and remote services.

• Implementation software was deployed at participating institutions.

• Collapsed completely when funding was gone.

RePEc

• is a federated system based on a metadata format (ReDIF) and a transport protocol (Guildford Protocol), both written by yours truly.

• It can run on standard FTP or HTTP server software.

• RePEc archives don’t offer local services to end users.

UPSPROTO

• In 2000, Herbert Van de Sompel started work to build a prototype system to provide services across the existing discipline-based digital libraries.

• The experience led to the formation of a working group that created an interoperability protocol called the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH).

• I was part of that group.

repositories

• OAI-PMH has been so widely implemented in repositories that we can say that a repository is a collection of documents on a server that implements this protocol.

• There is no official list, but counts of institutional repositories now exceed 2000.
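
To make "a server that implements this protocol" concrete, here is a minimal sketch of what a harvester does. It builds a `ListRecords` request and reads identifiers and titles out of a response. The verb, the `metadataPrefix=oai_dc` argument and the two XML namespaces come from the OAI-PMH specification; the base URL `http://example.org/oai`, the helper names `list_records_url` and `titles`, and the sample response are made up for illustration. A real harvester would also have to fetch the URL, follow resumption tokens and handle errors.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"   # OAI-PMH namespace
DC = "{http://purl.org/dc/elements/1.1/}"        # Dublin Core namespace


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build the URL of a ListRecords request against a repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def titles(response_xml):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    found = []
    for record in root.iter(OAI + "record"):
        ident = record.findtext(OAI + "header/" + OAI + "identifier")
        title = record.findtext(".//" + DC + "title")
        found.append((ident, title))
    return found


# Hand-written, heavily trimmed response; not taken from a real repository.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A sample working paper</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("http://example.org/oai"))
print(titles(SAMPLE))
```

The point of the design is that the harvester only ever sees metadata. The documents themselves stay on the repository's server.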

institutional repositories

• The initial purpose of institutional repositories has been to make institutional research papers available.

• This would create open access to research papers.

• But success in getting real scientific work deposited has been muted.

• In the meantime, there are other types of content in IRs.

http://openlib.org/home/krichel

Please shut down the computers when you are done.

Thank you for your attention!