Lessons Learned: Digitization of Cooke County Ledgers
University of North Texas Libraries
Contributors:
Trista Barker Digital Imaging Technician-Microfilm
Reyes Berrios Graduate Library Assistant
Sarah Lynn Fisher Project Coordinator, Oklahoma Historical Newspapers
Ana Krahmer Coordinator, Digital Newspaper Program
Hannah Tarver Department Head, Digital Projects Unit
Lessons Learned – 2
Table of Contents
I. Introduction
II. Workflow
III. Evaluation of Materials/Vendor Communication
IV. Microfilm Management
V. Supplementary Scanning of Ledgers
VI. Digital File Management
Quality Control
VII. Metadata and Upload
VIII. Recommendations for Future Projects
Evaluating and Scanning Microfilm
Suggestions for Future Projects
IX. Conclusion
Appendix A – Original Inventory of Ledgers
Appendix B – List of Microfilming Specifications
Appendix C – Digital Projects Unit Scanning Standards
Appendix D – Spreadsheet to Track Scanning, In Progress
Appendix E – Spreadsheet to Track Scanning, Complete
Appendix F – Field Values Used in the Super-Template Metadata Record
I. Introduction
In December 2010, the University of North Texas (UNT) Archives received a grant from the National
Historical Publications and Records Commission (NHPRC) to digitize and host several rare and unique
collections representative of the Civil War era and its effect in Texas. The project, titled “The Civil War
and its Aftermath: Diverse Perspectives,” contains a variety of historic items including letters, diaries,
military records, legal documents, and court ledgers from Cooke County, Texas.
The UNT Archives partnered with the UNT Libraries Digital Projects Unit (DPU), which managed all stages
of the digitization. Most of the items were digitized on flatbed scanners; however, due to the size of the
ledgers, it was determined that the best way to digitize them would be to have them microfilmed and
then to scan the microfilm instead of handling the full ledgers. In this case, scanning from microfilm
rather than the originals was also faster and more cost-effective, and it provided the added benefit of
creating a preservation microfilm master of each ledger. This paper describes and examines the process
the DPU implemented to digitize the Cooke County ledger collection; in doing so, it provides insight into
the problems one might encounter, as well as recommendations for institutions that may be considering
similar digital projects.
II. Workflow
For this project, the DPU combined workflows previously established and regularly implemented in the
process of digitizing rare bound volumes and newspapers from print materials. The first half of the
workflow below conforms to DPU standards for microfilm scanning (see Appendix C – Digital Projects
Unit Scanning Standards). Once the digital images were created from microfilm, the covers and any
additional scans were made from the print copies using a Zeutschel planetary scanner that can
accommodate large and fragile materials. The digital objects were then organized using an in-house file
naming system implemented by the DPU for journals, books, and other types of bound volumes.
Metadata was created using an XML “super-template,” and records were completed by students in-house.
Overview of the workflow:
1. Inventory ledgers: UNT Archives compiled an inventory list of the ledgers that confirmed the
contents of the collection. The DPU used this inventory at various stages in the workflow to
identify and track the physical materials and digital objects.
2. Communicate with microfilm creator (vendor): After consulting the inventory, the DPU
communicated the specifications and requirements for filming to the vendor.
3. Deliver ledgers to vendor: UNT Archives organized transportation of the materials to the
vendor by car. After receiving the ledgers, the vendor created microfilm from the physical
materials.
4. Retrieve ledgers and reels after filming: When microfilm creation was complete, the vendor
shipped the newly-created microfilm to the DPU. The vendor delivered the microfilm masters
along with a second-generation (2N) negative duplicate of each reel. At a later date, UNT
Archives organized the retrieval of the ledgers from the vendor.
5. Scan microfilm: Microfilm scanning took place in the Digital Projects Newspaper Unit on a Mekel
Mach V microfilm scanner, which can scan one full reel of microfilm in under 30 minutes.
6. Split/process page scans: The scanning technician split the captured images into orderly pages
and processed them using software to optimize image quality.
7. Quality control (QC) check images and track missing pages: The Digital Projects Newspaper Unit
performed a QC check to ensure that the digital files faithfully represented the physical items.
8. Re-scan covers, full-color pages, and pages missed during filming: In addition to adding missing
images discovered during QC, full-color scans were made to supplement the grayscale microfilm
images and provide attractive thumbnails for online viewing.
9. Magick-number/organize files for online viewing: DPU staff changed file names to provide
local, unique identifiers to individual scanned image files in preparation for the online
environment.
10. Make note of ledger heights for metadata: This information was recorded for inclusion in the
physical description section of the metadata record.
11. Create super-template file: A DPU metadata specialist created a generic metadata record
containing information that applied to the entire ledger collection.
12. Return ledgers to storage: The UNT Archives prepared the ledgers for long-term storage.
13. Upload digital collection: DPU staff uploaded the digital files to the server for The Portal to
Texas History (http://texashistory.unt.edu); at this time, the ledgers were hidden from the
public pending completed metadata records.
14. After upload, create metadata records: Students completed metadata records for individual
items within the collection, adding unique, item-level information to the fields missing from the
super-template record.
III. Evaluation of Materials/Vendor Communication
Employees of the DPU evaluated the physical collection prior to microfilming. Data they collected
included the number of volumes along with their approximate size, page count, and condition. Two
employees of the UNT Archives, Perri Hamilton and Richard Himmel, also created a document noting the
exact title, size, and page count of each ledger (See Appendix A – Original Inventory of Ledgers). This
information was used to determine the filming requirements (See Appendix B – List of Microfilming
Specifications). For example, many of the ledgers had pages with preprinted information but no
handwritten notations. To faithfully represent the original volume as a digital object, UNT
requested that all of the pages be filmed, regardless of their content. UNT also requested the
smallest possible reduction ratio to generate a higher resolution for the digital images. Since the ledgers
were bound, they required 2B filming orientation: both pages of the open book comprise one captured
image on the microfilm and are positioned with the text running horizontally across the film (see Figure
1).
Figure 1. 2B filming orientation: the open bound volume is turned horizontally to capture both open pages as one image.
Based on previous collaboration, the DPU chose the Oklahoma Historical Society (OHS) as the vendor to
create the microfilm of the ledger collection. UNT communicated the filming requirements and the
ledger inventory to the vendor via email prior to microfilming. OHS confirmed the filming standards
prior to receiving the materials and filming the ledgers. The UNT Archives organized the transport of the
physical ledgers to OHS, located in Oklahoma City, Oklahoma, to be filmed.
IV. Microfilm Management
Microfilm Evaluation
When the microfilm creation was complete, the Digital Projects Newspaper Unit assigned a unique
number to each reel and evaluated the microfilm, tracking data about the reels on microfilm evaluation
sheets. Information including the title, bar code or reel number, page count, reduction ratio, and original
size of the object was recorded on the sheets (see Figure 2).
Figure 2. Blank example of a reel information sheet used for microfilm evaluation.
Staff members also compared the page count from the microfilm to the page count on the pre-filming
inventory and noticed that pages had been skipped during microfilming. This highlights the merit of
reviewing microfilm prior to scanning. Although staff members were not able to compare the microfilm
to the original ledgers at that time, it appeared that most ledgers were not missing a significant number
of pages.
Since the microfilming process does not always allow for quick turnaround, DPU staff had to consider
the time that it would take to have items re-filmed to include the missing pages versus the time that
would be spent to rescan individual pages from the print items if they were not re-filmed. Staff decided
that the number of missing pages did not justify re-filming the collection in its entirety; however, OHS
re-filmed two of the ledgers completely. For those two ledgers, the number of pages missing from the
microfilm exceeded the acceptable limit staff had chosen for this project, roughly 25 or
more missing pages.
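The re-filming decision described above can be sketched in a few lines of Python. This is only an illustration of the rule of thumb, not the tool staff actually used: the function names are hypothetical, the page counts in the usage note are invented, and the only value taken from the project is the limit of roughly 25 missing pages.

```python
# Approximate limit from the project: roughly 25 or more missing pages.
REFILM_THRESHOLD = 25


def missing_pages(inventory_count: int, film_count: int) -> int:
    """Pages listed on the pre-filming inventory but not found on the film.

    A film count higher than the inventory count is treated as zero missing
    pages (an assumption made for this sketch).
    """
    return max(inventory_count - film_count, 0)


def needs_refilming(inventory_count: int, film_count: int,
                    threshold: int = REFILM_THRESHOLD) -> bool:
    """True when enough pages are missing to justify re-filming a ledger."""
    return missing_pages(inventory_count, film_count) >= threshold
```

With these invented counts, a 600-page ledger with 570 pages on film would be flagged for re-filming, while one with 590 pages on film would instead have its 10 missing pages rescanned from the print volume.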
Microfilm Scanning
After the evaluation of the microfilm, the reels were turned over to the department’s microfilm imaging
technician who used the reel information sheets to prepare items for scanning. The technician
employed the reduction ratio noted on the information sheets to reproduce the images at the proper
pixels per inch (ppi), and she used the reel identification numbers to arrange the file and directory
structures for the digital images. Once the technician created these preliminary settings, she scanned
the microfilm using Quantum Scan software.
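The link between reduction ratio and resolution can be illustrated with simple arithmetic. The sketch below assumes that the resolution relative to the original page equals the scanner's resolution at the film plane divided by the reduction ratio. The function names and the sample figures are hypothetical and do not reflect the settings used in Quantum Scan.

```python
def effective_ppi(film_scan_ppi: float, reduction_ratio: float) -> float:
    """Resolution of the digital image measured against the original page."""
    if reduction_ratio <= 0:
        raise ValueError("reduction ratio must be positive")
    return film_scan_ppi / reduction_ratio


def required_film_ppi(target_ppi: float, reduction_ratio: float) -> float:
    """Film-plane scanning resolution needed to reach a target ppi."""
    if reduction_ratio <= 0:
        raise ValueError("reduction ratio must be positive")
    return target_ppi * reduction_ratio
```

Under this assumption, film scanned at 4800 ppi with a 16x reduction would yield 300 ppi relative to the original, while the same scan of 24x film would yield only 200 ppi. This is why a smaller reduction ratio supports higher-resolution digital images.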
After each reel was scanned, the technician used a program called Quantum Process to split the two
pages on each image into individual page files. Adjustments to the contrast and background were made
within the software settings to ensure the best web presentation outcomes. The scans then went
through a final processing stage with Prizm Gray software, which de-skews (straightens) each image.
At this point, the scanning and processing stages were complete, and the individual page files were
separated into folders made for each book, to be handled as individual digital objects. A student
assistant separated the digital files, putting the images for each book into unique folders within the
parent folder for the microfilm reel. The student then checked each folder against the reel information
sheet for correct titles, dates, and volume numbers. Images for every ledger underwent a visual quality
control check to determine whether any pages were duplicated, blurred, or missing. If pages were duplicated,
the best image was kept and the others deleted; blurred or missing pages were noted, and rescans were
made at the next stage in the workflow.
V. Supplementary Scanning of Ledgers
At the start of the project, DPU staff determined that the covers of the ledgers should be scanned in
color on the Zeutschel OS 10000 planetary scanner to create more attractive thumbnails and to provide
a realistic representation of the originals. When staff members discovered that some pages had not
been filmed by the vendor, those pages were added to the queue to be scanned on the Zeutschel
scanner, along with any pages that had not been filmed at a high enough quality to meet DPU standards.
The supplementary scanning portion of the project was managed by a DPU Graduate Library Assistant
(GLA). Before scanning the ledgers, he evaluated the information contained in the reel information
sheets and inspected the condition of each ledger. Due to the age of the ledgers, many had deteriorated
pages and bindings that required careful manipulation while scanning to prevent further damage. One
precaution to address the fragility of the ledgers was to temporarily lock the scanner glass upright rather
than allowing it to rest on top of the pages as usual. The GLA followed DPU scanning standards in
scanning all covers and missing pages (see Appendix C – Digital Projects Unit Scanning Standards).
VI. Digital File Management
File Organization
After completion of all scanning, the digital files were renamed using ACDSee 9 Photo Manager
software. To process paginated books or items that have many pages, the DPU uses a standard file-
naming convention (termed “magick-numbering”) that orders the pages for display online. Each file
number contains eight digits: the first four digits represent the sequence of the files in the original
ledger (this includes the covers and pages without numbers); the next four digits indicate the
pagination, if there are page numbers printed in the ledger. For example, the file name “03070301”
designates the 307th image in the sequence of the book, which has the printed page number of “301” on
the physical page. In cases where the ledger contained letters or items glued to a page, a first scan was
taken of the entire page including any attached items. Unless items were glued entirely to the page, a
subsequent scan was taken of the back of each inserted item. In such cases, an alphabetic letter was
assigned to the end of the filename (e.g., 03070301a and 03080301b).
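The naming convention can be sketched in Python as follows. This is a minimal illustration based only on the description above, not the renaming procedure carried out in ACDSee. The function and class names are hypothetical, and file extensions are left out.

```python
import re
from typing import NamedTuple, Optional


class MagickNumber(NamedTuple):
    sequence: int          # position of the image within the ledger
    page: Optional[int]    # printed page number, if the ledger has one
    suffix: str            # letter for scans of attached items, else ""


def make_magick_number(sequence: int, page: Optional[int] = None,
                       suffix: str = "") -> str:
    """Build a file name: 4-digit sequence, optional 4-digit page, suffix."""
    name = f"{sequence:04d}"
    if page is not None:
        name += f"{page:04d}"
    return name + suffix


_PATTERN = re.compile(r"(\d{4})(\d{4})?([a-z]?)")


def parse_magick_number(name: str) -> MagickNumber:
    """Split a file name back into sequence, page, and suffix."""
    match = _PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"not a magick-number: {name!r}")
    sequence, page, suffix = match.groups()
    return MagickNumber(int(sequence), int(page) if page else None, suffix)
```

For example, `make_magick_number(307, 301)` returns `"03070301"`, and `parse_magick_number("03070301a")` returns a sequence of 307, a page of 301, and the suffix `"a"`. Because the sequence digits are zero-padded to a fixed width, an ordinary alphabetical sort of the file names also puts the pages in display order.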
Quality Control
During the supplementary scanning process, the GLA also performed a quality control inspection of each
digital image against the physical ledger pages to ensure that all the pages were properly digitized; this
QC portion of the process included pictures and letters contained in the ledger. The inspection step
revealed that yet more pages than originally expected were omitted, including pages without manual
entries as well as several pages that contained text. All omitted pages were then scanned, and their
images were added in the proper sequence to the item’s folder.
For ledgers that did not have printed page numbers, it was necessary to inspect each digitized image
against the physical object even more carefully. The digital files of those ledgers were named with the
sequence number only, rather than a full magick-number (e.g., 0307).
In some cases, objects inserted in the physical ledger were removed and microfilmed at the end of the
book. This meant that the digital image files were not in the same order as the print copy and their
contextual relevance to the event they represented was lost. Those items were rescanned and
numbered according to their place and sequence inside the ledger as appropriate.
VII. Metadata and Upload
Once the file scanning was complete, the DPU department head created a metadata super-template file,
which is a generic XML metadata record that contains only elements common to all items in a collection
(see Appendix F—Field Values Used in the Super-Template Metadata Record). As part of the upload
process, the system created a blank record for each ledger and populated its fields with the pre-filled,
commonly shared information from the super-template record. In this case, super-templates were used
to load the files into the online system as quickly as possible since the processing and upload stage can
be time-consuming for items as large as the ledgers. Because the DPU was primarily responsible for
the digitization aspects of the grant project, uploading the files early also ensured that project members
from other departments could access the records if the metadata needed to be completed more quickly
than DPU staff could accommodate. Members from all departments involved in the Civil War
grant project had access to the metadata editing interface (see Figure 3, below).
Figure 3. Editing metadata of an individual item via a web browser interface.
After the ledgers were uploaded and accessible on the metadata record editing system, DPU assistants
followed collection-specific guidelines to create records for each individual ledger. For example, the title
of each ledger was created according to the same formula: [Kind of ledger, Court, Cooke County, dates],
e.g., [Bar Docket and Appearances, Civil and Criminal District Court, Cooke County, 1871-1873]. The
students then made each ledger publicly accessible on the Portal as the record was completed.
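The super-template approach and the title formula can be sketched together in Python. This is a simplified illustration only: the actual super-template is an XML record, whereas the sketch uses a plain dictionary, and the field names and function names are hypothetical rather than the values listed in Appendix F.

```python
import copy

# Hypothetical stand-in for the super-template: values shared by every ledger.
SUPER_TEMPLATE = {
    "collection": "The Civil War and its Aftermath: Diverse Perspectives",
    "county": "Cooke County",
    "title": None,    # item-level field, completed later
    "height": None,   # item-level field, recorded from the physical ledger
}


def ledger_title(kind: str, court: str, dates: str,
                 county: str = "Cooke County") -> str:
    """Apply the formula [Kind of ledger, Court, Cooke County, dates]."""
    return f"[{kind}, {court}, {county}, {dates}]"


def build_record(template: dict, **item_fields) -> dict:
    """Copy the shared values, then fill in the item-level fields."""
    record = copy.deepcopy(template)
    record.update(item_fields)
    return record


record = build_record(
    SUPER_TEMPLATE,
    title=ledger_title("Bar Docket and Appearances",
                       "Civil and Criminal District Court", "1871-1873"),
)
```

Copying the template before updating it keeps the shared values unchanged, so every ledger record starts from the same collection-level information and differs only in the fields completed by hand.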
VIII. Recommendations for Future Projects
To support future endeavors, the following section offers observations and suggestions based on an
analysis of the DPU’s experience in managing the Cooke County ledger digitization.
Project Coordination
Projects with many pieces that take place in multiple stages require a clear chain of command. The
Project Manager of this particular grant, responsible for overseeing all of the grant work and keeping the
project on track, retired soon after the work began. To complete the work within the allotted time,
responsibilities had to be delegated across several people with differing skillsets, though no individual
was able to take full leadership of the project. This project required collaboration within the department
(i.e., between the Digital Projects Lab and the Digital Projects Newspaper Unit) as well as coordination
with members of other divisions in the Libraries, who were tasked with transporting the physical ledgers
and reporting on the progress of digitization. These factors led to some communication and workflow
wrinkles that might have been prevented had one leader been able to serve as a hub of communication.
Evaluating and Scanning Microfilm
Information on the microfilm reel information sheets did not always appear to match the physical
ledgers and scanned digital files. This problem was rooted in two places: first, the physical objects that
had been microfilmed by the vendor were not available to the DPU to compare to the microfilmed
version during microfilm evaluation; second, though it was communicated and confirmed with the
microfilm vendor that the entirety of every volume should be filmed (including blank pages), dozens of
blank pages as well as some with content were missing. The team also requested that the vendor
provide a small reduction ratio in the filming process, but the images on the microfilm turned out to be
much smaller than what the digital team would consider standard and most useful.
To manage the problems that simply could not be resolved, such as missing pages in the microfilm, the DPU
used worksheets to track items as they were being processed so that persons working on the grant
could follow the status of the items (see Appendix D – Spreadsheet to Track Scanning, In Progress and
Appendix E – Spreadsheet to Track Scanning, Complete). Although the regular DPU workflow does not
usually incorporate spreadsheets, they were used in this case as a way to track progress and minimize
the impact of problems.
Suggestions for Future Projects
Although there were several challenges involved in keeping the project on track, the DPU was able to
finish the work according to regular Lab standards well within the deadlines for the grant project. Based
on the lessons learned from completing this project, here are some important considerations for other
groups undertaking projects of this kind:
Organize the project at the start: Make sure that several people are involved in the planning process – if
the original project manager is unavailable, there should be a secondary person in place to take up
coordination of the project in order to ensure that everything continues smoothly.
Evaluate materials carefully: Whenever there are materials to be microfilmed, ensure that the physical
objects have been thoroughly evaluated to consider all of the possible situations that may be
encountered in microfilming. These may include: loose items inserted in the books, folded pages,
partially-attached items on pages, and multiple items inserted or attached to the same pages.
Allow sufficient time: Whenever possible, make sure to include plenty of extra time in the workflow.
At any stage, a minor problem could cost significant amounts of time to solve. For example,
misunderstandings about requested microfilming standards for the project could require some
items to be re-filmed. Also remember that the workflow and timeline should take the physical
objects into account. Special transport or handling of physical items can require more time than
expected.
Communicate clearly with vendors: Having information available and verified in multiple places can
help to prevent misunderstandings. Include a hard copy of filming requirements and other relevant
documents with the physical objects when they are transferred to the vendor.
Centralize information: Whenever possible, keep copies of all relevant information in a place that is
easy for all project members to access, such as a project management site or internal wiki page.
Have a plan for microfilm evaluation: It is best to have the original object available when evaluating the
microfilm; however, if this is not possible, ask the vendor to note when the object is
color/grayscale/black-and-white so that you can determine if the microfilm meets the agreed-upon
standards.
Conduct multiple inventories: To make sure that no physical or digital items are lost, check them
against the original inventory at multiple stages. Creating an accurate initial inventory will prevent
problems later.
Log progress: Keeping track of all stages of a project with this many steps can be crucial. It is also
important to have a centralized place to track problems and progress in order to handle similar
issues in the future and to facilitate preparation of reports for granting agencies.
IX. Conclusion
Under the requirements of the grant, the DPU successfully completed the digitization of the items
and associated metadata and is now hosting the ledgers in The Portal to Texas History’s collection “The
Civil War and its Aftermath: Diverse Perspectives”
(http://texashistory.unt.edu/explore/collections/CWADP/browse/). Traffic statistics about individual
items within the series “[Cooke county records, 1857-1950]” display user visits and referrals (see Figure