The Vietnam Center and Archive. Stephen Maxner, Ph.D., Director. Steve.maxner@ttu.edu

Post on 23-Dec-2015

1

The Vietnam Center and Archive
Stephen Maxner, Ph.D.
Director
Steve.maxner@ttu.edu

2

SARBICA EXECUTIVE MEETING & SEMINAR

“Managing and providing online access to digital collections”

October 1, 2009

3

Presentation Outline

• Brief overview of the Virtual Vietnam Archive at Texas Tech University
• How we planned the Virtual Archive
• How we select collections and materials for digitization
• How we digitize and organize information
• How we deliver online digital materials

4

Overview: The Virtual Vietnam Archive

• Created in 2001
• Contains over 3 million pages of digitized historical texts and materials.
• Documents, photographs, slides, oral histories, artifacts, films and moving images, audio recordings, maps, finding aids, etc.
• We add approximately 15,000 pages/month
• Available free to researchers worldwide through the Internet.

5

How we planned the Virtual Vietnam Archive

• Defined mission, objectives, and purpose
• Researched software and equipment
• Developed best practices, policies and procedures for digital project
• Provide and manage online materials and access

6

Mission and Objectives of Virtual Vietnam Archive

• Increase user access to archive collections.
• Increase control over archive collections and the information they contain.
• Provide alternate user-friendly format for at-risk, inaccessible, and fragile materials (assist with material conservation and preservation).
• Provide innovative user environment.

7

Selecting Software and Equipment

• Software
  • Data Management and User Access
  • Digital Capture and processing
• Computers, Networks, and Storage
• Digitization Equipment

8

Software – Archive Management

• Cuadra Star Archives Management Software
  • Inverse Relational Database
  • Unlimited data entry fields
  • Cross-platform and cross-database search
  • Excellent web-interface for user access
  • Fully customizable and scalable

9

Software - Digital Capture Software

• Adobe Acrobat - Documents
• Adobe Capture - Document text
• Adobe Photoshop - Images
• Cool Edit Pro (Adobe Audition) - Audio
• Pinnacle Studio - Video
• Windows Media Encoder – Streaming Video

10

Hardware and Equipment

• Servers
• Storage
• Tape Backup Systems
• Desktop computers
• Desktop Scanners
• Specialized Digital Scanners and Conversion Systems

11

Servers and Digital File Storage

• Dell PowerEdge Servers, Multi-Processor, RAID 5
• Storage Area Network (SAN) (60 Terabytes)

12

Dell PowerVault Tape Library

13

Documents and Manuscripts

Fujitsu fi4220c
Fujitsu fi4530c

14

Maps and Large Format Items

HP Designjet 815mfp

Epson Expression 10000XL

15

Best Practices, Policies and Procedures

• Collection assessment, selection, handling, and scanning
• Digital file formats and file quality
• Naming conventions
• Data and Metadata capture
• Information and Digital File Security
• Online Access System

16

Evaluate Collections

• Evaluate content and usefulness to researchers
• Determine which collections will translate well to an electronic environment.
• Prioritize: collections of high research or historical value go at the top of the list.

17

Consider Physical Condition

• Will handling materials during the digitization process cause damage?
• Will digitization prevent further damage?
• Choose “archive-friendly” scanners
• Train scanning staff to properly handle archive documents

18

Digital File Formats and File Quality

• Balance digital archive file storage and online access copies
• Documents and Manuscripts – PDF
  • 300 – 600 DPI (100 DPI for online)
• Images – TIFF / JPG
  • 300 – 600 DPI (100 DPI for online)
• Moving Images – AVI (30fps)
  • online WMV (256 kbps)
• Audio – WAV (44100 Hz)
  • online MP3 (128 kbps)
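
The trade-off between archival masters and online copies can be made concrete with a little arithmetic. The sketch below uses the DPI values from the slide, but the page size (US Letter, 8.5 x 11 inches) and the uncompressed 24-bit color depth are assumptions for illustration only; the function names are hypothetical.

```python
def pixel_dimensions(width_in, height_in, dpi):
    """Return (width, height) in pixels for a page scanned at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


def uncompressed_bytes(width_in, height_in, dpi, bytes_per_pixel=3):
    """Estimate raw image size; 3 bytes per pixel assumes 24-bit color."""
    width_px, height_px = pixel_dimensions(width_in, height_in, dpi)
    return width_px * height_px * bytes_per_pixel


# Assumed page size (US Letter); the slide does not specify one.
PAGE_W, PAGE_H = 8.5, 11.0

for dpi in (100, 300, 600):
    w, h = pixel_dimensions(PAGE_W, PAGE_H, dpi)
    megabytes = uncompressed_bytes(PAGE_W, PAGE_H, dpi) / 1_000_000
    print(f"{dpi} DPI: {w} x {h} px, about {megabytes:.1f} MB uncompressed")
```

Pixel count grows with the square of the resolution, so a 300 DPI master holds nine times the pixels of a 100 DPI online copy of the same page. Actual file sizes depend on the compression applied, which this estimate ignores.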

19

Naming Convention

• PDF document is saved as a 10-digit number
• For example: 2123206063
  • 212 = Collection Number
  • 32 = Box Number from the Collection
  • 06 = Folder Number from the Box
  • 063 = Document Number from the Folder (63rd document in the folder)
• Retains original order of collection and allows researchers to view electronic files as if in archive
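
The example number above can be split back into its parts with fixed-width slicing. This is a minimal sketch that assumes every item number uses the same 3-2-2-3 digit layout as the example; the names `ItemNumber`, `parse_item_number`, and `format_item_number` are hypothetical and not part of the archive's software.

```python
from typing import NamedTuple


class ItemNumber(NamedTuple):
    collection: int
    box: int
    folder: int
    document: int


def parse_item_number(item_number: str) -> ItemNumber:
    """Split a 10-digit item number into collection, box, folder, document."""
    if len(item_number) != 10 or not (item_number.isascii() and item_number.isdigit()):
        raise ValueError(f"expected exactly 10 digits, got {item_number!r}")
    return ItemNumber(
        collection=int(item_number[0:3]),   # assumed width: 3 digits
        box=int(item_number[3:5]),          # assumed width: 2 digits
        folder=int(item_number[5:7]),       # assumed width: 2 digits
        document=int(item_number[7:10]),    # assumed width: 3 digits
    )


def format_item_number(item: ItemNumber) -> str:
    """Rebuild the zero-padded 10-digit string from its parts."""
    return f"{item.collection:03d}{item.box:02d}{item.folder:02d}{item.document:03d}"


item = parse_item_number("2123206063")
print(item)
print(format_item_number(item))
```

Under this fixed-width assumption, sorting the file names as plain strings orders them by collection, then box, then folder, then document, which is one way such a scheme can preserve the original order of a collection.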

20

Metadata Collected

• Develop Metadata Collection Plan to ensure consistency
• Document Title
• Item Number
• Number of Pages
• Collection Name
• Author
• Copyright Status
• Document Language

21

Metadata Collected

• Document Date
• Document Date Range
• Subject Terms/Keywords
• Document Full Text
• Document Condition
• Physical Location of the Collection
• Record Creation and History of Updates
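
The fields listed on the two Metadata Collected slides can be sketched as a single record type with a simple completeness check, which is one way to support a consistent collection plan. The field names, types, and the `missing_fields` helper are assumptions for illustration; the slides do not say which fields are mandatory or how the database stores them.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class DocumentRecord:
    # One attribute per metadata item named on the slides (names are hypothetical).
    title: str = ""
    item_number: str = ""
    page_count: int = 0
    collection_name: str = ""
    author: str = ""
    copyright_status: str = ""
    language: str = ""
    document_date: str = ""
    document_date_range: str = ""
    subject_terms: List[str] = field(default_factory=list)
    full_text: str = ""
    condition: str = ""
    physical_location: str = ""
    update_history: List[str] = field(default_factory=list)


def missing_fields(record: DocumentRecord) -> List[str]:
    """List the names of fields that are still empty on this record."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


record = DocumentRecord(title="Sample title", item_number="2123206063", page_count=4)
print(missing_fields(record))
```

A check like this treats every field as expected; a real plan would likely allow some to be empty, for example a date range when a single date is known.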

22

Paper Capture (OCR)

• OCR is used to capture the words from the document so the words can be copied and then pasted into the database to become search terms.
• After the Paper Capture is pasted into the database it is not saved because it would make the file size too big.
• Paper Capture is also not saved because it can sometimes distort the image online.

23

Information Security: Copyright

• Documents that are not copyrighted can be seen online and downloaded
• Documents that are copyrighted can be digitized but can only be viewed on site via INTRANET – not the Internet (per Digital Millennium Copyright Act)
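
The viewing rule above amounts to a two-input decision. The sketch below expresses it as a function; the `Network` enum and `can_view` name are hypothetical, and the code models only viewing, since the slide does not say whether copyrighted items may be downloaded on site.

```python
from enum import Enum


class Network(Enum):
    INTERNET = "internet"   # remote researchers
    INTRANET = "intranet"   # on-site access only


def can_view(copyrighted: bool, network: Network) -> bool:
    """Apply the rule from the slide: copyrighted items are intranet-only."""
    if not copyrighted:
        return True
    return network is Network.INTRANET


for copyrighted in (False, True):
    for network in Network:
        print(copyrighted, network.value, can_view(copyrighted, network))
```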

24

Digital Information Security
Digital Backup Systems

• Redundant Systems
• Online Storage – on servers and SAN
• Magnetic Tapes – SDLT 1 (160/320GB)
  • Nightly Backups
  • Weekly Backups
  • Offsite Storage
• Archival (dual layer acrylic) Gold CDs and DVDs

25

Online Access: Simple Search

26

Advanced Search

• Allows for multiple keywords
• Document and Collection Title
• Media format
• Date and date span
• Language
• Date when placed online
• Folder and Box view of collection
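
To illustrate how several of these criteria combine, here is a minimal in-memory sketch in which every supplied filter must match. It is not how the archive's Cuadra Star software works internally; the `Record` type, the `advanced_search` function, and the sample records are all made up for illustration, and only a subset of the listed criteria is covered.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional, Sequence


@dataclass
class Record:
    title: str
    media_format: str
    language: str
    doc_date: date


def advanced_search(
    records: Iterable[Record],
    keywords: Sequence[str] = (),
    media_format: Optional[str] = None,
    language: Optional[str] = None,
    start: Optional[date] = None,
    end: Optional[date] = None,
) -> List[Record]:
    """Return records that satisfy every filter that was supplied."""
    results = []
    for record in records:
        title = record.title.lower()
        # Every keyword must appear in the title (case-insensitive).
        if not all(keyword.lower() in title for keyword in keywords):
            continue
        if media_format is not None and record.media_format != media_format:
            continue
        if language is not None and record.language != language:
            continue
        # Date span: keep only records inside the inclusive range.
        if start is not None and record.doc_date < start:
            continue
        if end is not None and record.doc_date > end:
            continue
        results.append(record)
    return results


# Made-up sample records, for illustration only.
sample = [
    Record("Sample report on river patrols", "Document", "English", date(1968, 3, 1)),
    Record("Sample photograph of a river base", "Photograph", "English", date(1969, 7, 15)),
    Record("Sample letter home", "Document", "Vietnamese", date(1971, 1, 9)),
]

hits = advanced_search(
    sample,
    keywords=["river"],
    media_format="Document",
    start=date(1965, 1, 1),
    end=date(1970, 12, 31),
)
print([record.title for record in hits])
```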

27

Advanced Search

28

Online Summary Information

29

Online Summary Information Results

30

Search Results

31

Benefits of Digital Online Archives

• Researcher access: 3 million pages online in Virtual Vietnam Archive
• Researcher use: Virtual Vietnam Archive hosts more than 1.5 million online search sessions/year with downloads of approximately 2 million documents/year
• Conservation and Preservation – fewer researchers requesting to handle actual documents. Less Handling = Conservation

32

Thank you very much!
Stephen Maxner, Ph.D.
Director
Steve.maxner@ttu.edu
