
Mlisc gsdl

Nov 18, 2014

Page 1: Mlisc gsdl

Greenstone Digital Library Software: An Overview

Imran Mansuri
Project Assistant (Library Science)
INFLIBNET Centre

7 March 2011 Prepared by Imran Mansuri

Page 2: Mlisc gsdl

Agenda

• Introduction: Digital Library Software (DL)
• Greenstone Digital Library Software (GSDL)
• Introduction
• History
• Versions
• Features
• Unique Features
• Technology used
• Example Sites
• Example Collections

Page 3: Mlisc gsdl

Digital Library Software

• The term “Digital Library” refers to a library in which collections are stored in digital formats (as opposed to print, microform, or other media) and are accessible by computers
• The digital content may be stored locally or accessed remotely via computer networks
• Books and images are accessed in digital format
• Information can be accessed from anywhere over the Internet

Page 4: Mlisc gsdl

Digital Libraries : Features

Dynamic Electronic Information Systems

Increase Portability

Efficiency of Access

Flexibility

Availability


Page 5: Mlisc gsdl

Digital Library Software

DSpace

Fedora

EPrints

ResourceSpace

Greenstone

Page 6: Mlisc gsdl

Greenstone Digital Library Software

• The Greenstone Digital Library Software (GSDL) provides a way of building and distributing digital library collections, opening up new possibilities for organizing information and making it available over the Internet or on CD-ROM
• Developed by the New Zealand Digital Library Project (www.nzdl.org) at the University of Waikato
• Distributed in co-operation with UNESCO and Humanities Library Project, Romania

Page 7: Mlisc gsdl

GSDL : Some Facts

• Current version: 2.82 and 3.03
• Available from http://www.greenstone.org
• Software suite for building, maintaining, and distributing digital library collections
• Comprehensive, open-source
• Distribution and promotion partners:
o UNESCO
o Human Info NGO, Belgium

Page 8: Mlisc gsdl

GSDL : History

1995 - Digital library of Computer Science Technical Reports, established by the New Zealand Digital Library
1997 - Decision to use the GPL (General Public License); name "Greenstone" adopted; work with Human Info NGO to produce humanitarian CD-ROMs
1998 Apr - First CD-ROM collection released: Humanity Development Library
1998 Aug - Greenstone.org website established
1999 - BBC collection established

Page 9: Mlisc gsdl

2000 Apr - Greenstone mailing list started
Aug - Formally established cooperative effort with UNESCO and Human Info NGO
Nov - Distribute software on SourceForge
2002 Apr - Development of Greenstone3
Mar - Official opening of the Niupepa collection, development of the Greenstone Librarian Interface
Jun - First UNESCO Greenstone CD-ROM

Page 10: Mlisc gsdl

2003 - A Java development that became known as the Greenstone Librarian Interface
2005 Nov - Initial release of Greenstone3
2006 Apr - Greenstone Support Group for South Asia launched

Page 11: Mlisc gsdl

GSDL : Version

2000 Feb - gsdl 2.12
2000 Apr - gsdl 2.21
2000 Dec - gsdl 2.30
2001 Feb - gsdl 2.31
2002 Jan - gsdl 2.38
2003 Jun - gsdl 2.40
2004 Feb - gsdl 2.50
2005 Apr - gsdl 2.60
2005 Nov - gsdl 3.00
2006 Mar - gsdl 2.70
2007 Apr - gsdl 2.80
2008 - gsdl 3.03
Current release - gsdl 2.82

Page 12: Mlisc gsdl

GSDL : Features

• Multi S/W Platform
• Multi Lingual Support
• Structured Metadata in XML using DC
• Metadata Extraction
• Plug-ins for Documents
• Full-text mirroring
• Text Level Penetration
• Concurrent & Dynamic Content Development
• Uniform Presentation

Page 13: Mlisc gsdl

Collection Building

• Web and command line mode
• Input collections:
– GSDL server (files)
– Remote (FTP – files, HTTP – website pages)
• Collection input: batch mode, NOT interactive
• Document formats: HTML, PDF, Text, Word (Doc, RTF), PS, e-mail, bibliographic

Page 14: Mlisc gsdl

• Support for full text tagging for hierarchical document browsing
• Automatic text extraction and indexing
– ‘Plugins’ for different document formats (HTMLPlug, PDFPlug, etc.). May fail for some documents!
– XML representation – conversion to HTML for display
– Native document format – storage and display (via browser plugins, helper applications)
• Data compression support
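The plugin idea on this slide can be sketched in a few lines of Python. This is an illustration only, not Greenstone's code: the `PLUGINS` table, `extract_text`, `build_batch`, and the toy extractors are hypothetical names (the slide itself names only HTMLPlug and PDFPlug). The sketch picks an extractor by file extension and records a failure instead of stopping the batch, since extraction may fail for some documents.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple


def html_to_text(raw: str) -> str:
    """Toy stand-in for an HTML plugin: drop tags, collapse whitespace."""
    no_tags = re.sub(r"<[^>]+>", " ", raw)
    return " ".join(no_tags.split())


def plain_text(raw: str) -> str:
    """Toy stand-in for a plain-text plugin: collapse whitespace."""
    return " ".join(raw.split())


# Hypothetical registry: file extension -> extractor function.
PLUGINS: Dict[str, Callable[[str], str]] = {
    ".html": html_to_text,
    ".htm": html_to_text,
    ".txt": plain_text,
}


def extract_text(filename: str, raw: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (text, error); exactly one of the two is None."""
    dot = filename.rfind(".")
    ext = filename[dot:].lower() if dot != -1 else ""
    plugin = PLUGINS.get(ext)
    if plugin is None:
        return None, f"no plugin for '{ext or filename}'"
    try:
        return plugin(raw), None
    except Exception as exc:  # one bad document should not stop the batch
        return None, f"{type(exc).__name__}: {exc}"


def build_batch(files: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Process every file; collect extracted text and a list of failures."""
    texts: Dict[str, str] = {}
    failures: List[str] = []
    for name, raw in files.items():
        text, error = extract_text(name, raw)
        if error is None:
            texts[name] = text
        else:
            failures.append(f"{name}: {error}")
    return texts, failures
```

Calling `build_batch` with a mapping of file names to raw contents returns the extracted texts plus a list of files that could not be processed, which can be reviewed after the batch finishes.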

Page 15: Mlisc gsdl

• Metadata
– Automatic extraction of simple metadata (e.g. Title, date)
– Explicit metadata via ‘Classifiers’
o Hierarchical (e.g. Subject)
o List (e.g. Organization, Author)
– Used for browsing and field-based searching
• Multi-language support via Unicode
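The two classifier styles on this slide, hierarchical and list, can be pictured with a small Python sketch. It is not Greenstone's classifier code: the sample records, the `/` separator inside the Subject values, and the function names are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical records: metadata already attached to each document.
DOCS = [
    {"id": "d1", "Title": "Soil Basics", "Subject": "Agriculture/Soil", "Author": "Rao"},
    {"id": "d2", "Title": "Crop Rotation", "Subject": "Agriculture/Crops", "Author": "Patel"},
    {"id": "d3", "Title": "Clean Water", "Subject": "Health/Water", "Author": "Rao"},
]


def hierarchical_classifier(docs: List[dict], field: str, sep: str = "/") -> dict:
    """Group document ids into a nested tree keyed by each level of `field`."""
    tree: dict = {}
    for doc in docs:
        node = tree
        for level in doc[field].split(sep):
            node = node.setdefault(level, {})
        # Documents are stored under the reserved key "_docs" at the leaf.
        node.setdefault("_docs", []).append(doc["id"])
    return tree


def list_classifier(docs: List[dict], field: str) -> Dict[str, List[str]]:
    """Group document ids under each distinct value of `field`, sorted by value."""
    groups: Dict[str, List[str]] = {}
    for doc in docs:
        groups.setdefault(doc[field], []).append(doc["id"])
    return dict(sorted(groups.items()))
```

The tree from `hierarchical_classifier(DOCS, "Subject")` would back a drill-down browse page, while `list_classifier(DOCS, "Author")` would back a flat alphabetical list.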

Page 16: Mlisc gsdl

Collection Browse and Search

• Full text search
• Metadata (field) search and browse
• Boolean
• Ranked
• Multi-language support for browse/search interface
• Search history, search term highlighting…
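The difference between Boolean and ranked search can be shown with a toy inverted index in Python. This is a simplified sketch and not how Greenstone's indexer works: the sample documents and function names are hypothetical, and the ranking here is plain summed term frequency, chosen only because it is easy to follow.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Set

# Hypothetical three-document collection.
DOCS = {
    "d1": "digital library software for building collections",
    "d2": "library collections on CD-ROM",
    "d3": "digital collections and digital library metadata",
}


def build_index(docs: Dict[str, str]) -> Dict[str, Counter]:
    """Inverted index: term -> Counter of {doc id: term frequency}."""
    index: Dict[str, Counter] = defaultdict(Counter)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term][doc_id] += 1
    return index


def boolean_and(index: Dict[str, Counter], terms: List[str]) -> Set[str]:
    """Boolean search: documents that contain every query term."""
    postings = [set(index.get(t.lower(), {})) for t in terms]
    return set.intersection(*postings) if postings else set()


def ranked(index: Dict[str, Counter], terms: List[str]) -> List[str]:
    """Ranked search: documents with any query term, best score first."""
    scores: Counter = Counter()
    for t in terms:
        scores.update(index.get(t.lower(), {}))
    # Sort by score (high first), then by id so ties are deterministic.
    return [d for d, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]
```

A Boolean query returns an unordered set of exact matches, whereas a ranked query returns every partially matching document in order of score.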

Page 17: Mlisc gsdl

Collection Presentation

• Search results formatting
– Format strings in the configuration file
• Home page customization
– Using macros
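The general idea of a format string, a template whose placeholders are filled from each document's metadata, can be sketched in Python as follows. The slide does not show Greenstone's actual syntax, so the square-bracket placeholders, the `TEMPLATE` value, and the `render` helper are assumptions made for this illustration.

```python
import html
import re
from typing import Dict

# Hypothetical template: [Name] is replaced by that metadata field's value.
TEMPLATE = "<td><b>[Title]</b></td><td>[Author]</td>"

PLACEHOLDER = re.compile(r"\[([A-Za-z_.]+)\]")


def render(template: str, metadata: Dict[str, str]) -> str:
    """Fill each [Field] placeholder; unknown fields become empty strings."""
    def substitute(match: "re.Match") -> str:
        value = metadata.get(match.group(1), "")
        return html.escape(value)  # keep metadata from breaking the markup
    return PLACEHOLDER.sub(substitute, template)
```

Rendering the same template once per search hit gives every result row a uniform layout, which is the point of keeping the format in configuration rather than in code.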

Page 18: Mlisc gsdl

GSDL : Features

• Easy Installation
• Easy Maintenance
• Hierarchy Structure
• Interface Customization
– Front Page Design, Header for the Digital Library, Collection Icon, Cover Images
• Collection Configuration (Collect.cfg) File
• Scalability, Flexibility

Page 19: Mlisc gsdl

Collection Distribution

• Web
• CD-ROM
– Publish created collections to the CD-ROM
– Windows only
– Two possibilities:
o Install GSDL software to HDD and access content on CD
o Run GSDL search engine out of the CD!

Page 20: Mlisc gsdl

GSDL : Unique Features

• Incremental Collection Building
• Content Development in 3 different ways
• Good Documentation and Active Mailing List
• Variety of Plug-ins for different document Types
• Publishing on CD-ROMs
• Data Compression

Page 21: Mlisc gsdl

GSDL : Technology Used

• Technology used in the current version
– Java 1.6 (or higher)
– ImageMagick
– Application Server: Apache 2.2
– GSDL 2.82 for Linux and Windows

Page 22: Mlisc gsdl

GSDL : Example Sites

India: Archives of Indian Labour


Page 23: Mlisc gsdl

United States: New York Botanical Garden


Page 24: Mlisc gsdl

International: Global Library Services Network



Page 28: Mlisc gsdl

Some Observations

Strengths:
• Configurability: content extraction for indexing, presentation layout, metadata for browsing and field-based searching (a little difficult, though!)
• Extensibility: plugins for content extraction, Unicode for multi-language support, source code availability
• Full-text search on a variety of document formats
• XML, Unicode, Dublin Core support
• Data compression
• CD-ROM publishing

Page 29: Mlisc gsdl

Limitations:

• Interactive content updating and management not possible
• No duplicate identification
• Metadata handling appears to be a little complex
• Linux version seems to be more robust than Windows
• Hangs while processing some documents during collection building – no way to gracefully handle this

Page 30: Mlisc gsdl

Current Status

• Strong development work – CS department at University of Waikato, NZ
• Z39.50 experimental interface now available
• Promoted by UNESCO
• Beginning to be used worldwide
• Can be expected to reach CDS/ISIS-like popularity (particularly in developing countries)

Page 31: Mlisc gsdl

Documentation and Help

• Available at: http://www.greenstone.org
– Software
– Demo collections
– FAQ
– Tutorial materials
• Documentation: Installer’s Guide, User’s Guide, Developer’s Guide, and other reading materials

Page 32: Mlisc gsdl

• Mailing lists:
– Greenstone Users List
– Greenstone Developers List
• Greenstone Documentation Wiki
http://wiki.greenstone.org/wiki/index.php/GreenstoneWiki
