Digital Libraries
Nick Narcise
April 4th 2006
What is a Digital Library?
Definition from Wikipedia
A digital library is a library in which a significant proportion of the resources are available in machine-readable format (as opposed to print or microform), accessible by means of computers.
The digital content may be locally held or accessed remotely via computer networks.
D-Lib Magazine
What Do You Do with a Million Books?
Gregory Crane
Tufts University
D-Lib Magazine
March 2006
Volume 12 Number 3
ISSN 1082-9873
http://www.dlib.org/dlib/march06/crane/03crane.html
Main Focus
The ability to extract from the stored record of humanity useful information in an actionable format for any given human being of any culture at any time and in any place
Reduce the tangle of text mining, analysis, and searching technologies
converting analog source to text
translating one language to another
Transform raw text into data
How is a Library digitized?
The process of digitizing a library began with the catalog, moved to periodical indexes and abstracting services, next to periodicals and large reference works, and finally to book publishing.
Some of the largest and most successful digital libraries are Project Gutenberg, ibiblio, and the Internet Archive.
Optical Character Recognition
From Wikipedia, the free encyclopedia
Optical character recognition, usually abbreviated to OCR, involves computer software designed to translate images of typewritten text (usually captured by a scanner) into machine-editable text, or to translate pictures of characters into a standard encoding scheme that represents them (such as ASCII or Unicode).
Problems with OCR
May have errors
Useless as a knowledge base
Human beings are still much better at reading and interpreting the contents of page images than machines.
Text, Information, Knowledge and the Evolving Record of Humanity
Gregory Crane and Alison Jones
Tufts University
D-Lib Magazine
March 2006
Volume 12 Number 3
ISSN 1082-9873
http://www.dlib.org/dlib/march06/jones/03jones.html
C. Montgomery Burns: "I'd like to send this letter to the Prussian consulate in Siam by aeromail. Am I too late for the 4:30 autogyro?"
Clerk: "Uhhh, I better look in the manual ..."
Burns: "The ignorance! ..."
Clerk: "This book must be out of date – I don't see 'Prussia,' 'Siam' or 'autogyro.'"
From "Mother Simpson," The Simpsons Television Show, Episode 3F06
Digital Reference Materials
Thesaurus of Geographic Names (TGN)
Includes names and other information about places such as cities, counties, and nations, and their associated physical features such as mountains, coasts, and rivers. Other information related to history, population, culture, art, and architecture is included.
TGN can associate the obsolete name Siam with the nation of Thailand (tgn,1000142), but also with towns named Siam in Iowa (tgn,2035651), Tennessee (tgn,2101519), and Ohio (tgn,2662003). Prussia appears, but only as a general region (tgn,7016786), with no indication of when or whether it was a sovereign nation.
Alexandria Digital Library (ADL)
Represents a sophisticated framework with which to create such resources: places can be associated with temporal information about their foundation (e.g., Washington, DC, founded on 16 July 1790).
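The ambiguity in the TGN example above (one name, several candidate places) can be sketched as a small lookup table. This is only an illustration: the `Place` record, the `GAZETTEER` dictionary, and the `lookup` and `is_ambiguous` helpers are hypothetical names, not TGN's actual data model or API. The only data used are the identifiers quoted on the slide.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Place:
    tgn_id: int   # identifier as quoted on the slide
    label: str    # human-readable label (hypothetical field)
    kind: str     # rough place type (hypothetical field)


# Toy gazetteer holding only the entries mentioned above;
# the structure is an assumption, not the real TGN schema.
GAZETTEER = {
    "siam": [
        Place(1000142, "Thailand", "nation"),
        Place(2035651, "Siam, Iowa", "town"),
        Place(2101519, "Siam, Tennessee", "town"),
        Place(2662003, "Siam, Ohio", "town"),
    ],
    "prussia": [
        Place(7016786, "Prussia", "general region"),
    ],
}


def lookup(name: str) -> List[Place]:
    """Return every candidate place for a name, ignoring case and padding."""
    return GAZETTEER.get(name.strip().lower(), [])


def is_ambiguous(name: str) -> bool:
    """A name is ambiguous when it maps to more than one place."""
    return len(lookup(name)) > 1


if __name__ == "__main__":
    for candidate in lookup("Siam"):
        print(candidate.tgn_id, candidate.label, candidate.kind)
```

A name lookup alone cannot pick Thailand over the three towns; the sketch carries no dates, which is the gap that temporal information of the kind ADL supports would fill.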
Consider the sentence
“The current price of tea in China is 35 cents per pound.”
A digital library that could plot the prices of various commodities in different markets over time, plot the lifetimes of individuals, or extract and classify many events would be very useful.
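As a rough illustration of turning the sentence above into data, the sketch below pulls a commodity, a market, and a price out of it with a regular expression. The pattern, the `extract_price` helper, and the field names are assumptions made for this example. Real text would need far more robust language processing than a single pattern.

```python
import re
from typing import Dict, Optional

# Hypothetical pattern tuned to sentences shaped like the example;
# it is not a general-purpose price extractor.
PRICE_PATTERN = re.compile(
    r"price of (?P<commodity>\w+) in (?P<market>[A-Z]\w+) is "
    r"(?P<amount>\d+(?:\.\d+)?) (?P<unit>\w+) per (?P<per>\w+)"
)


def extract_price(sentence: str) -> Optional[Dict[str, object]]:
    """Return a structured record for a matching sentence, else None."""
    match = PRICE_PATTERN.search(sentence)
    if match is None:
        return None
    record: Dict[str, object] = dict(match.groupdict())
    record["amount"] = float(match.group("amount"))  # numeric value for plotting
    return record


if __name__ == "__main__":
    print(extract_price("The current price of tea in China is 35 cents per pound."))
```

Records like this, collected across many documents and paired with dates, are the kind of data a price-over-time plot would need.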
Digital Reference Materials
Carefully transcribed primary sources
<l n="22">Forte fuit iuxta tumulus, quo cornea summo</l>
Gazetteers and semi-structured text sources
<div 2 type=entry><head>AARONSBURG</head><p>P v., Hains t., Centre co., Pa. It is at the eastern extremity of Penn's valley, near Penn's creek, 32 m. Bellefonte, 89 N.W. Harrisburg. 181 W. It contains a lutheran church, two stores, and 450 inhab
Citation-based authority lists
<div1 type="entry" id="abdera"><head>Abdera</head><div2 type="subentry" id="abdera-1"><head>Abdera, city of Thrace</head><div3 type="index"><list type="index"><item><bibl n="Paus. 6.5.4">Paus. 6.5.4</bibl>, <bibl n="Paus. 6.14.12">Paus. 6.14.12</bibl></item><item>a town of Thrace on the Nestus: <bibl n="Hdt. 1.168">Hdt. 1.168</bibl>, <bibl n="Hdt. 6.46">Hdt. 6.46</bibl>, <bibl n="Hdt. 7.109">Hdt. 7.109</bibl>, <bibl n="Hdt. 7.120">Hdt. 7.120</bibl>, <bibl n="Hdt. 7.126">Hdt. 7.126</bibl></item><item>founded at grave of Abderus: <bibl n="Apollod. 2.5.7">Apollod. 2.5.7</bibl></item><item>Xerxes' first halt in his flight: <bibl n="Hdt. 8.120">Hdt. 8.120</bibl></item></list></div3></div2></div1>
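Because the citation-based authority list above is well-formed XML, its references can be read by program. The sketch below uses Python's standard `xml.etree.ElementTree` on an abbreviated copy of the Abdera entry. The `citations_by_topic` helper is a hypothetical name, and pairing each `<item>`'s leading text with its `<bibl n="...">` values is an assumption about how one might use the markup.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Abbreviated copy of the Abdera entry shown above (two of its items).
ENTRY_XML = """<div1 type="entry" id="abdera"><head>Abdera</head>
<div2 type="subentry" id="abdera-1"><head>Abdera, city of Thrace</head>
<div3 type="index"><list type="index">
<item>founded at grave of Abderus: <bibl n="Apollod. 2.5.7">Apollod. 2.5.7</bibl></item>
<item>Xerxes' first halt in his flight: <bibl n="Hdt. 8.120">Hdt. 8.120</bibl></item>
</list></div3></div2></div1>"""


def citations_by_topic(xml_text: str) -> List[Tuple[str, List[str]]]:
    """Pair each item's leading description with its cited passages."""
    root = ET.fromstring(xml_text)
    pairs = []
    for item in root.iter("item"):
        # Text before the first <bibl>, with the trailing colon removed
        topic = (item.text or "").strip().rstrip(":").strip()
        references = [bibl.get("n") for bibl in item.iter("bibl")]
        pairs.append((topic, references))
    return pairs


if __name__ == "__main__":
    for topic, references in citations_by_topic(ENTRY_XML):
        print(topic, "->", references)
```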
Digital Reference Materials
Machine-readable dictionaries
<entryFree id="n3709" key="a)krwth/rion" type="main"><orth extent="full" lang="greek">a)krwth/rion</orth>, <gen lang="greek">to/</gen>, (<etym lang="greek">a)/kros</etym>)<sense id="n3709.0" n="A" level="1"><tr>topmost</tr> or <tr>prominent part</tr>, <foreign lang="greek">a). tou= ou)/reos</foreign> mountain <tr>peak</tr>, <bibl n="Perseus:abo:tlg,0016,001:7:217"><author>Hdt.</author><biblScope>7.217</biblScope></bibl>
General Encyclopedias
A Research Library Based on the Historical Collections of the Internet Archive
William Y. Arms, Selcuk Aya, Pavel Dmitriev
Computer Science Department, Cornell University
Blazej Kot
Information Science, Cornell University
Ruth Mitchell, Lucia Walle
Cornell Theory Center, Cornell University
D-Lib Magazine
February 2006
Volume 12 Number 2
ISSN 1082-9873
http://www.dlib.org/dlib/february06/arms/02arms.html
Main Idea of Article
Academic researchers have to comb through collections of libraries, museums, and archives to analyze and synthesize the information buried within them.
A Web Library for Social Science Research
The idea is to replace much of the tedious manual effort with computer programs that act as the researchers' agents.
The challenge was to organize the materials and provide powerful, intuitive tools that make a huge collection of semi-structured data accessible to researchers, without demanding high levels of computing expertise.
Questions?
Thank You